Captions provided by White Coat Captioning (https://whitecoatcaptioning.com/). Communication Access Realtime Translation (CART) is provided in order to facilitate communication accessibility and may not be a totally verbatim record of the proceedings.
JASON LENGSTORF: Hello, everybody. And welcome to another episode of Learn with Jason. Today on the show, we have another Jason, Jason Mayes. Welcome to the show, thanks so much for joining us.
JASON MAYES: Yeah, thanks for having me today. Great to have double Jasons on the show. Excited to see what happens today.
JASON LENGSTORF: You remember Doublemint gum, when they did the commercials with twins? Double the pleasure, double the fun. Now it's double the Jason, double the fun. So for those of us that aren't familiar with your work, do you want to give a little bit of a background?
JASON LENGSTORF: I'm going to ask you some questions about this, because I find it so fascinating. This is a term I hadn't really heard until recently. There was another developer, Elrick Ryan, Creative Engineer was his title.
JASON MAYES: Yeah.
JASON LENGSTORF: So this is a role I think a lot of people probably aren't aware of. Effectively, your job is to get weird with code. Is that a good --
JASON LENGSTORF: Yeah, yeah.
JASON MAYES: That's where I am right now, essentially.
JASON LENGSTORF: Very cool, yeah. I am really, really excited to see how this goes, because I feel like there is just kind of unlimited potential here. You know, we've done a couple episodes of machine learning on the show before. We've had Gant Laborde come on, we've put masks on people, found faces and figured out how to calculate angle and size and all that stuff, so we could put a mask on, so the eyes actually showed up over their eyes. Really kind of fun things that we could do. And so what I think -- and then I just did a show with Cassidy Williams, I think it was last week. Time is both very fast and very slow right now. I would believe you if you said it was last week, or three years ago. So we just did an episode with facial recognition using face-api.js, where we were reading the emotion. And so we could make a face at the camera, and it would show an emoji to show, like, how we felt. You know, it was let's see what we could do knowing nothing about this software. And so it's very, very interesting to me to see all the different ways that you can use this stuff. Because, you know, it seems to range from we can do things with images, like find faces, and you can use that for anonymization, or do what we did, do something silly, put a mask on somebody.
JASON LENGSTORF: Yeah, I really do love that whole kind of artistic approach. That you can do -- especially when you hear something like machine learning, or you start to talk about artificial intelligence. The first thing that you think is very -- it sounds like science. It sounds like something that's very rigid, and very exacting, and very mathematical. So when you see the ways that people are using it to just be silly, I feel like I love that. I love that kind of combination of art and science.
JASON MAYES: Totally. And I always try and use a model in the way it wasn't designed to be used. I don't know if you've got time to show a few demos I created if you go to my Twitter or something.
JASON LENGSTORF: Yeah, let me switch over to this view so we can do it. So here's your Twitter. Where do you want me to -- uh-oh.
JASON MAYES: Go to the home, my profile page. I've got one pinned tweet, which is right at the very top of the Twitter account.
JASON LENGSTORF: Uh-oh, I have problems. Let me quit everything that's not --
JASON MAYES: Pushing the Mac to its limits.
JASON LENGSTORF: I really am. Putting a hurting on it. Okay. Let's open another window here, and I'm going to store these pages. Good, that's what I wanted. Let's get these ones in here.
JASON MAYES: Fingers crossed. There we go, good stuff. Scroll down to the first pinned post at the top there. This is the thing that got a lot of attention when I first joined the team. I needed to learn how to use this stuff. What I managed to do is make myself invisible in realtime. This is happening all live in the web browser. You can see when I get on the bed, for example, the bed still deforms in the bottom image and I'm completely removed from the scene. Not just a cheap effect where you're pausing the background and, hey, we're done. You could do that with a single image without machine learning. This is a little bit more advanced, and that is doing it in realtime. The other thing I'd like to show, if you could go to YouTube, type my name and clothing size estimation, or something like that. It should pop up as the result.
JASON LENGSTORF: This one?
JASON MAYES: Yeah, yeah, that's the one. You might want to mute it and scroll ahead to where I'm actually -- that beautiful music there. Basically, in under 15 seconds, I can estimate the various proportions of my body and then figure out what size clothing I should be buying on those random brands' websites. I don't know about you, but I'm terrible at buying clothes. I don't know what size I am and all that kind of stuff. Oftentimes I start with a problem I have in real life and deal with it with the technologies I have available to me. There we go. Click on the bottom. I enter my height, and then it will use body segmentation, which is the ability to detect which pixels in the image belong to my body, versus the background, and with that I can figure out different waist measurements, inner leg, chest, back, that kind of stuff. And I do a side profile and a front shot, and, boom, it gives me my results and I can do what I need to do, which is kind of nice.
JASON LENGSTORF: That's slick. Well, that's super cool.
JASON MAYES: I like making little fun examples like this. That's basically my role. I get to make these cool prototypes and talk about them and, hopefully, inspire others to make some cool stuff, too.
JASON LENGSTORF: Very cool. Yeah, yeah, I'm sharing those links in the chat. They'll be in the show notes, as well. Now that we're switched over, before we move to the next thing, we have live captioning available on the show. It is made possible by White Coat Captioning. Thank you so much for being here today. We have Ashly helping us out, so thank you very much, Ashly. And that is made possible through our sponsors. Netlify, Fauna, Sanity, and Auth0 all kicking in to make this show more accessible to more people. We really appreciate it. Thank you very much. So with that being said, if you want to see the live captioning, they are at lwj.dev/live. You can see it happening here. And let's bounce over. Let's see. We looked at this example. This is really cool, this active camouflage. That's the dream. In the '90s, they would make people invisible through active camouflage.
JASON MAYES: Right, sci-fi is a big inspiration. Totally. We should maybe just switch to the community, as well. If you put #madewithtfjs, you can see what the community has made, as well. That looks correct. There we go. Go to -- you can see people making really beautiful artistic visualizations. This was a self-reflection piece, where you walk up to the cam and see your face being made in WebGL particle effects and this cool stuff. Cool artistic installation. This is super cool, it can turn any character into an animation controlled by your very own body using our face mesh and pose estimation models. So you can see the lady in the top left moving around. Estimating her skeleton and everything else. All running in the web browser and she can now join a Zoom meeting or Google Meet as a character. You can have privacy working at home, if you have lots of people in the room, you can turn yourself into a character and have more privacy.
JASON LENGSTORF: These are really cool.
JASON MAYES: That's just the first two. It just keeps on going. So much cool stuff happening.
JASON MAYES: Yeah, so, essentially, it's an open-source machine learning library, which basically means it's got all the functions you would ever need to make anything you could ever dream of. Think of Legos of machine learning, of individual building blocks that you can put together in whatever way you wish to make amazing things. And that could be things like object detection. It could be natural language processing, it could be whatever you dream up. Even generative stuff, networks that can make images from various training data and so on and so forth. So all of it is possible using this library. And the original TensorFlow has two flavors of the library. There's like the low-level mathematical building blocks, and then there's the high-level APIs, as well, which in Python is Keras. And in TensorFlow.js we call it the Layers API, but it's basically the same as Keras and gives you the high-level building blocks to make things, if you so desire. But, yeah, using the library still requires some level of machine learning knowledge. Not something you can just jump into without some understanding of machine learning. That's why with TensorFlow.js, we, obviously, have the library. However, we've also made some pre-made models that are super easy to use and get started with to experiment with machine learning, even if you have zero knowledge in machine learning. This is a great starting point. Here's some of the models we have available. Image classification, object detection. To tell the difference, image classification is a binary yes or no, whereas object detection gives you the bounding box, as well. Bit more advanced.
JASON LENGSTORF: In which classification would be that joke from Silicon Valley, hot dog, not hot dog, yes or no. And object detection would be more like, this is a photo, and it would say in this photo I see a book and a person and a chair.
JASON MAYES: Or show you where the hot dog is and how many hot dogs there are, if there's several in the image. Where the other, I don't know how many there are, but there's definitely a hot dog in there somewhere. Yeah, exactly. Then if you want more power, you have the body segmentation. We can generalize as image segmentation. If you want to take the hot dog and know the pixels of the hot dog.
JASON LENGSTORF: Is this what you used for the size estimation demo?
JASON MAYES: Correct, exactly. Body segmentation model.
JASON LENGSTORF: That was for the demo we looked at?
JASON MAYES: Yes. Pose estimation tells you where your skeleton is, essentially. You could do gesture recognition or something like this, potentially, by following your skeleton over time, do some fun effects with that.
JASON LENGSTORF: Very cool. There's a lot out here. This is very cool.
JASON MAYES: Yeah.
JASON LENGSTORF: All right, awesome. So what we were thinking, and I pitched kind of a silly idea, right, and I hope you're on board with it. But what I wanted to do, is I wanted to use the object detection to make like a secret password. So in order to get to a page in a website, I want you to have to show the camera a sequence of objects. And so I picked up a couple things from around my house. I found a banana.
JASON MAYES: Always good for scale.
JASON LENGSTORF: I found a fork. And I found a book. And so I want my website to be protected by a visual password, where I have to show in a given sequence a banana, a fork, and a book.
JASON MAYES: Sounds good, yeah.
JASON MAYES: Yep, looks good to me.
JASON LENGSTORF: So how would you like to start?
JASON LENGSTORF: This is a huge image. This was a mistake. Let me go to --
JASON MAYES: Power of your machine here.
JASON LENGSTORF: Let's take it down to, like, 800 --
JASON MAYES: 480x480.
JASON LENGSTORF: Doesn't need to be a masterpiece. That's 99% smaller. That seems correct. Let's do that.
JASON MAYES: That was a large image. I think you'll be pushing your computer to the limits there. Yeah, this is a much smaller image. Now I can copy this.
JASON MAYES: Shove that in our page somewhere in the body. Yep.
JASON LENGSTORF: Banana. Okay. And then object classification with TensorFlow.js. And then, yeah, let's leave that Glitch button on that. Sounds nice.
JASON MAYES: Perfect.
JASON LENGSTORF: Let's also look at the --
JASON MAYES: Yeah, gives you a live preview there. That's always good to see.
JASON LENGSTORF: All right. I'm making the text bigger, because it's not very readable. Let's see. I might do this in two windows just for the sake of making that easier to read. So let's take this dial, and I'm just going to make the image -- we'll go max width 90. Good. Good. Yeah, good enough.
JASON MAYES: Perfect.
JASON LENGSTORF: Let's go 100%. I think the body will take care of that. There we go. Now we've got an image.
JASON LENGSTORF: Let's walk through this line by line here, just to make sure we understand what's happening. We're getting our image using regular browser. That's not right. We want a query selector.
JASON MAYES: Only one image.
JASON LENGSTORF: Also no ID. Query select the image, then this cocoSsd model is the one that came out of here.
JASON MAYES: Correct, that's right. We've imported that at the top there.
JASON LENGSTORF: Yeah. So we get the cocoSsd, we load it, and this is a promise so we don't have to do callbacks or whatever. And then the return value is the cocoSsd model, and using the model we run a detect function on the image. And then we get back our predictions, and our predictions, if I'm understanding correctly, those would be what's in this image, where is it, what are the bounding boxes.
JASON MAYES: Exactly. Basically, it's just a JSON object that basically contains the bounding box and the class of the thing it thinks it's seen, its probability, if it thinks it's accurate, that kind of stuff. We'll print this out and see the raw stuff coming back in just a second to the console. So we can then see what's going on.
JASON LENGSTORF: Tainted canvases?
JASON MAYES: Oh, yes. So, basically, here's a fun thing about CORS. If you go back to the HTML here, you need to add the crossorigin attribute to the image.
JASON LENGSTORF: Oh. Is it dash?
JASON MAYES: I think it's one word, just crossorigin, lower case, if I remember correctly. Let me just double check that one second.
JASON LENGSTORF: There we go.
JASON MAYES: Got it? Okay, nice.
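The same fix can also be applied from script rather than markup. The following is a minimal sketch, assuming the image host sends CORS headers. `crossOrigin` is the standard DOM property behind the `crossorigin` attribute; `loadImageForModel` and its injectable `createImage` parameter are hypothetical names, used here only so the sketch can run outside a browser.

```javascript
// Hypothetical helper: create an image whose pixels the model is allowed to read.
function loadImageForModel(url, createImage = () => new Image()) {
  const img = createImage();
  // Set before src, so the browser makes the request in CORS mode.
  img.crossOrigin = "anonymous";
  img.src = url;
  return img;
}

// In the page (the URL is a placeholder):
// const img = loadImageForModel("https://example.com/banana.jpg");
```

Without this, reading pixels from a cross-origin image is blocked, which is what the "tainted canvases" error is reporting.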
JASON LENGSTORF: Look at that, it thinks it's a banana! And it is 97% sure that it is a banana.
JASON MAYES: Awesome. You can see in there, there's also kind of another object that gives it the bounding box. There's a little thing you can expand there at the top called BBox.
JASON LENGSTORF: Not for beat boxing.
JASON MAYES: Maybe we could do that if we're successful today, have a go at beat boxing. It's got all the X/Y coordinates to draw a rectangle around that, if you so desire. All that good stuff.
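For reference, a minimal sketch of what those predictions look like and how they might be read. The `bbox`, `class`, and `score` fields follow the published COCO-SSD output format; the sample numbers and the `describePredictions` helper are made up for illustration.

```javascript
// Invented sample of what model.detect(img) resolves to.
// bbox is [x, y, width, height] in pixels; score is between 0 and 1.
const samplePredictions = [
  { bbox: [42, 18, 390, 410], class: "banana", score: 0.97 },
];

// Illustrative helper (not part of the library): one readable line per object.
function describePredictions(predictions) {
  return predictions.map((p) => {
    const [x, y, width, height] = p.bbox;
    const percent = Math.round(p.score * 100);
    return `${p.class} (${percent}%) at x=${x}, y=${y}, ${width}x${height}`;
  });
}

console.log(describePredictions(samplePredictions));
```

The four `bbox` numbers are all that is needed to draw a rectangle over the detected object.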
JASON LENGSTORF: Nice. That's, like -- that's pretty simple. That feels pretty good to use. It's approachable, it didn't require us to do a whole bunch of setup. We just load the models. The abstraction here is we have no idea how it's classifying these images. This is where it can get murky, you're kind of trusting somebody else's data and hoping for the best.
JASON MAYES: Yeah.
JASON LENGSTORF: I think the "C" in CORS stands for "cry." That's correct. CORS is cross-origin resource sharing. Also, thank you, Thomas, for the sub. I saw that earlier and wanted to shout it out. Okay, so from here, what do we do next?
JASON MAYES: Right. So we've got the ability to recognize objects. We should probably just clarify at this point the cocoSsd model has been trained on 90 common objects, things like bananas, mobile phones, things people probably have around the house. And, of course, if you wanted to train your own object, we can talk about that later. There's ways to do that. For now, we're stuck with these 90 objects it knows how to classify. Hopefully, it's going to recognize our books, forks, and bananas. Hopefully, things it will be able to recognize. So next thing we need to do, I guess, is instead of using an image, we need to access the web cam and get some live data coming back. You can actually call model.detect many, many times. Once the model is loaded, you can call it as many times as you'd like. We can just continuously do that on the web cam stream instead.
JASON LENGSTORF: Nice, okay.
JASON MAYES: I'd probably recommend at this point -- there's no code here to show you how to do that. That, of course, is web dev stuff, getting access to the web cam. So what we probably want to do here is add a video element into the HTML, which is where we can then get the user media stuff to render to later on. Maybe create a video tag and give it, I don't know, an ID of webcam or something like this, so we can access it easily later.
JASON LENGSTORF: Okay.
JASON LENGSTORF: Yeah.
JASON MAYES: Yeah, yeah.
JASON LENGSTORF: Okay, so I'm going to --
JASON LENGSTORF: Perfect. Okay. So that also means, I think -- let's see, that's deferred. Do we need to defer all of these instead of putting them at the bottom? I guess we'll find out. Let's put that here, and then I'm going to come back out here.
JASON MAYES: Might need to add another tag at the bottom to include the script.js.
JASON LENGSTORF: When you defer, it waits until the whole DOM is loaded, and since we didn't defer, it will be loaded before the defer script. We're doing weird stuff. Technically, we'd want to defer the TensorFlow scripts. In fact, maybe we should do that, just for the sake of not confusing anyone later. We can put these up here, and we can just mark them as deferred.
JASON MAYES: As long as the TensorFlow stuff gets loaded in order and before the script.js, then we're good. Otherwise, you're going to get errors that TensorFlow is not a thing, and the world will end.
JASON LENGSTORF: We're in good shape here. Now what we're doing is a deferred load of everything, starting with TensorFlow, then cocoSsd, then our script. Here we are in our script. I'm actually going to drop this image out.
JASON MAYES: We don't need it anymore. That was in our test.
JASON LENGSTORF: In our script, all of this will break if we don't comment it out, because there is no image to test.
JASON MAYES: Correct, for now, yes. We'll get back to that bit in just a little bit. Let's add the web cam in first, I guess. Maybe we just want to style that video tag so it's got width and height so we can see it on the page and we can see stuff rendered to it, or something.
JASON LENGSTORF: Okay. Let's give web cam a display.
JASON MAYES: Maybe 640x480, that's a standard web cam resolution. That's good enough.
JASON LENGSTORF: Aspect ratio doesn't work yet, does it?
JASON MAYES: That's a good question.
JASON LENGSTORF: We'll go height zero. We need a container on this. So we'll do like a div class, responsive, embed. And then we can put the video inside of that. And our responsive embed is going to -- responsive-embed. It's already display: block, but we'll make it position: relative. We'll make it a width of 100%, a height of 0, and a padding-bottom of -- 16 divided by 9. No, 56.25%. The background red, so we can for sure make sure that worked. There it is. My favorite CSS debugging technique is always --
JASON MAYES: Always good to see how others do their CSS stuff. Make sure everything works correctly.
JASON LENGSTORF: Top, left, right, and bottom of zero. And that will give us -- I think we can set a background of blue on this one. And that should give us -- what's going on? Did I typo something?
JASON MAYES: There we go, okay. I think you need to have the parent to have a position, as well, in order for the child to respect, I think.
JASON LENGSTORF: That's positioned.
JASON MAYES: There it is. Sorry, my eyes are not working together.
JASON LENGSTORF: What aren't you -- that's odd. Whatever. We'll just use that. So we'll go top, left, and then we'll go width: 100%. Height: 100%. Not sure why one is working and the other wouldn't. But I'm sure it's something silly the chat will point out. Okay, excellent. Good, good, good. So we are ready with a bounding box. We don't have anything on it. Maybe we can give it like a border, so that we can see it. There we go. That's where our web cam should be.
JASON MAYES: Cool. Next thing to do then would be to get access to the web cam using user media.
JASON LENGSTORF: Okay.
JASON MAYES: So if I remember correctly, essentially, it's navigator.mediaDevices.getUserMedia, camel cased. So we need to make an object above that with the constraints. So make a constraints object with a property of video set to true. That will get the video only. Don't need the audio. Yeah, cool. I think that will get the --
JASON LENGSTORF: Okay, so that's allowing --
JASON MAYES: That won't do anything just yet. What we need to do is use a .then. That's going to pass us a stream object back, to whatever function we pass there. And then we can set the video to the stream that gets passed back, essentially. Yeah, get a reference to our video object, and then we set its srcObject.
JASON LENGSTORF: Like that?
JASON MAYES: srcObject, actually, with the word "Object" at the end of "src." Capital "O." Cool. Make that equal to the stream that got passed back. That looks good. Now we just need to enable it. Is that in a function, or -- no --
JASON LENGSTORF: Should just run. Tells me my web cam is on. We are https. Let's check for errors. What did we get?
JASON MAYES: Okay, no errors. Okay.
JASON LENGSTORF: Check the video.
JASON MAYES: Your web cam is available in the Chrome thing at the top right, right?
JASON LENGSTORF: I'm going to check the getUserMedia, just to make sure. We get a stream, get user media, that all worked. With our stream, we need to put it somewhere. Video, true, good, yeah. Okay, but show me how to put it in a video tag. You don't have a single demo? Come on. Examples, here we go. Video, source object is mediaStream. Oh, we got to tell it to play.
JASON MAYES: Oh, of course.
JASON LENGSTORF: I think we actually might be able to --
JASON MAYES: Autoplay on the tag.
JASON LENGSTORF: I think that works. Yeah, look at it go. I was supposed to turn this thing on. What kind of production am I running here? Look at that.
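The steps so far can be sketched roughly as follows. `navigator.mediaDevices.getUserMedia`, `srcObject`, and `autoplay` are standard browser APIs; the `startWebcam` name, the `#webcam` selector, and the injectable `mediaDevices` parameter are assumptions made for this example (the last one only so the sketch can be exercised outside a browser).

```javascript
// Ask for video only; audio is not needed for object detection.
const constraints = { video: true };

// Hypothetical helper: wire the camera stream into a <video> element.
function startWebcam(video, mediaDevices = navigator.mediaDevices) {
  // Same effect as putting the autoplay attribute on the tag.
  video.autoplay = true;
  return mediaDevices.getUserMedia(constraints).then((stream) => {
    video.srcObject = stream; // note the capital "O"
    return stream;
  });
}

// In the page, assuming the video tag has id="webcam":
// startWebcam(document.querySelector("#webcam"));
```

Browsers only expose the camera on secure origins, which is why the https check in the conversation matters.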
JASON MAYES: Got all the lights.
JASON LENGSTORF: Now you can see this side of my face, which was before lost to the shadows. All right. So now we'll be able to -- so this is the stream, and then this is my computer. So I'll be showing to the computer.
JASON MAYES: Bananas like this, exactly.
JASON LENGSTORF: We'll basically be playing this game. Okay, good.
JASON MAYES: Now, that is the good question. So just to prevent any issues later, I recommend also when we add the -- when we set the source object, we should add an event listener for when the video is loaded. I think loadeddata is the event we should be listening for. And that's what we can use to then start telling our machine learning to predict the web cam results. So that way we know the video stream is working, because I've had problems in the past where you start trying to predict things on a blank video stream, and basically everything blows up and it's not so good.
JASON LENGSTORF: Yeah, and that was, what, on -- oh --
JASON MAYES: loadeddata, all lower case, one word. So add event listener.
JASON LENGSTORF: Add event listener.
JASON MAYES: First parameter, the string "loadeddata." Then a function after that. This is going to be where we're going to start activating our machine learning kind of prediction loop, if you will. Whatever that will be.
JASON LENGSTORF: Do I just want to get straight into this thing, or?
JASON MAYES: Yeah, we can probably copy the cocoSsd stuff we had from before. The load stuff, we should have loaded before, actually. Okay, we can do it here. Let's do it here, let's do it here. That's fine, yeah, cool. Exactly what you had before, paste it in there for now. Once it's loaded, we'll call our animation loop or something, which can be a separate function.
JASON LENGSTORF: In this case, am I running it against the stream, or the video, or what am I running it against?
JASON MAYES: It will be the video tag itself. Get element by ID or something like that.
JASON LENGSTORF: I abstracted it out here so I could use it again. So get that, then get predictions.
JASON MAYES: Correct. Yeah.
JASON LENGSTORF: Okay. So that should give us our predictions. Good.
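A minimal sketch of that wiring, under a few assumptions: `cocoSsdLib` stands in for the global `cocoSsd` object defined by the COCO-SSD script tag, and `detectOnceWhenReady` and `onPredictions` are hypothetical names. `loadeddata` is the standard media event, and `load()` and `detect()` are the COCO-SSD calls used earlier in the episode.

```javascript
// Hypothetical helper: wait for real video frames, then run one detection.
function detectOnceWhenReady(video, cocoSsdLib, onPredictions) {
  const handler = () =>
    cocoSsdLib
      .load() // resolves with the model
      .then((model) => model.detect(video)) // run against the <video> tag itself
      .then(onPredictions);

  // Fires once the first frame of the stream is available,
  // so the model never sees a blank video.
  video.addEventListener("loadeddata", handler);
}

// In the page:
// detectOnceWhenReady(document.querySelector("#webcam"), cocoSsd, console.log);
```

Loading the model inside the listener matches what was typed on stream; loading it earlier and only detecting inside the listener would also work.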
JASON MAYES: Fingers crossed.
JASON LENGSTORF: Excellent.
JASON MAYES: If we run that now, it would probably say person, because it does recognize people. It would probably see you for now. See what pops up.
JASON LENGSTORF: It does. It's 86% sure I am a person. The other 14% is, what, bear?
JASON MAYES: Who knows. Other things in there, too.
JASON LENGSTORF: Okay. So now we need to set this up -- 14% beard. That was the right answer. So I want this to not yell at me. That should still work. It does, it doesn't? There it is.
JASON MAYES: There it is, yeah. Takes a few seconds, depending on your Internet connection, to load the model and refresh. Basically, good stuff. Yeah, we've done it for one frame, so we need to make an animation loop. Now that we know we can call model.detect in that area, what we should do is create a separate function, a named function, that we can call repeatedly on the web cam, and in there, we'll just take our model.detect code. And instead of calling model.detect from load, we'll put that above and call this function once it's ready. Yeah, perfect. So that should give the same result. But now, of course, when we finish a prediction in detect, we can call detect objects again, essentially.
JASON LENGSTORF: So what you're saying is window.requestAnimationFrame, and we would just run detect objects model, like this.
JASON MAYES: Sure, request animation frame should be in the looped one above. The first can be a regular call, but the request animation frame should be in the function above, which will be looping forever.
JASON LENGSTORF: Oh, I understand. You want it to call recursively. I get it. Yep, that works for me. So we need to also pass in the video. Good. Okay. So that gives us our loop, and we should be getting --
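The recursive loop being described might look roughly like this. `model.detect` and `window.requestAnimationFrame` are real APIs; the `detectObjects` signature, the `onPredictions` callback, and the injectable `schedule` parameter are assumptions for this sketch (the last one only so it can run outside a browser).

```javascript
// Hypothetical looping detector: run one detection, then queue the next.
function detectObjects(
  model,
  video,
  onPredictions,
  schedule = (callback) => window.requestAnimationFrame(callback)
) {
  return model.detect(video).then((predictions) => {
    onPredictions(predictions);
    // Queue the next pass only after this one has finished,
    // so slow detections never pile up on top of each other.
    schedule(() => detectObjects(model, video, onPredictions, schedule));
  });
}

// In the page, after cocoSsd.load() resolves with `model`:
// detectObjects(model, video, (predictions) => console.log(predictions));
```

The first call is a regular function call; every later call comes from the scheduled callback, which is what keeps the loop running.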
JASON MAYES: Theoretically, we should be getting a whole bunch of stuff coming out once the model loads. What's that say?
JASON LENGSTORF: It dumped our -- oh, that's just me not doing it right.
JASON MAYES: Oh, here we go.
JASON LENGSTORF: So this should just give us a very fast stream of --
JASON MAYES: Exactly. Boom, there we go.
JASON LENGSTORF: So each of these is going to be, right now, like a person or whatever. Then check this out, watch. My most recent one. There's banana! Now it's gone. So let's grab one from down here. And it will say person. So as we move things in and out of the camera, it should change what we see, which is pretty awesome.
JASON MAYES: Exactly. Multiple objects or an array of things. So what might be useful at this point just for debugging purposes is maybe write out to a paragraph tag somewhere else on the page what we're seeing, instead of having to look at the console all the time. Then we can in realtime see what's coming back.
JASON LENGSTORF: So we're not actually using this, so I'm just going to drop it. And let's instead show the current model. And so what I'm going to do instead here, is I'm going to use document.querySelector. We should call that current predictions. And we can set the inner text to be JSON.stringify(predictions, null, 2). What I think that will do is replace everything inside of it. We'll call this "current predictions." And what we should see -- hopefully.
JASON MAYES: Where is it?
JASON LENGSTORF: There we go. So here's our ongoing classification.
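A small sketch of that debugging output. `JSON.stringify(value, null, 2)` and `innerText` are standard; the `showPredictions` name and the `#current-predictions` selector are hypothetical, and the indentation is only visible if the target element preserves whitespace (for example a `<pre>` tag).

```javascript
// Hypothetical helper: replace the element's contents with pretty-printed JSON.
function showPredictions(element, predictions) {
  // The third argument indents nested values by two spaces.
  element.innerText = JSON.stringify(predictions, null, 2);
}

// In the page, called from inside the detection loop:
// showPredictions(document.querySelector("#current-predictions"), predictions);
```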
JASON MAYES: Beautiful.
JASON LENGSTORF: Showed a banana. Still sees a person in the background, if I hide my face, it will get rid of me. Let's show it a fork. Can you get fork?
JASON MAYES: Maybe a little closer.
JASON LENGSTORF: Fork. Then if I showed a book. This is a great book, by the way. It's on the history of punch. And this is book. It's like 50% sure this is a book. Close enough to get us there.
JASON MAYES: Good enough. That's the other thing, as well. The score it gives back, we as programmers can decide if we're going to allow a certain threshold or not. That's our choice. So if we are trying to recognize a cat, maybe making a pet feeder, we're 100% sure it's a cat, then we're going to give it treat, otherwise we're not. We can decide the threshold.
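Choosing a threshold can be as small as a filter over the predictions. This is a sketch only: the 0.5 cutoff is an arbitrary value picked for illustration, and `confidentClasses` is a hypothetical helper name.

```javascript
// Arbitrary cutoff for this example; tune it per use case.
const MIN_SCORE = 0.5;

// Hypothetical helper: keep only the class names the model is confident about.
function confidentClasses(predictions, minScore = MIN_SCORE) {
  return predictions
    .filter((prediction) => prediction.score >= minScore)
    .map((prediction) => prediction.class);
}
```

A pet feeder would pass a much higher `minScore` than a silly password demo, since a false positive costs more there.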
JASON LENGSTORF: Super cool. Welcome, Auth0. Auth0 just showed up with friends. What is up, what were you all working on today? What we are working on is we're using TensorFlow.js to build a visual password. We're going to protect some content by requiring people to show some objects. And so I did, for funzies, I did some drawings of the things that I wanted us to look at. So here are some drawings of a banana, a fork, and a book, that we can use for this demo. And so what I was thinking would be kind of fun is, like, I think we can do this pretty quick and dirty. We can show the password that we want, and then we can do this in random order.
JASON MAYES: Random.
JASON LENGSTORF: Right, so we'll shuffle the array and have those three images, then show them in whatever order. In order to get to the next, you have to show the items in sequence. If I'm thinking about this correctly, just to sort of pseudo code this out. Let's see, we're done down here. To pseudo code this out, we need to, one, show the three images in a random order. Two, we need to tell -- or know which image is next in the array.
JASON MAYES: Yep.
JASON LENGSTORF: Let's see. Mark an image, or mark an object as "seen," when it's detected. And then show secret content once all three images or objects are detected. Does that seem --
JASON MAYES: That's, basically, the gist of it I think. Basically, what I'll add is if we show an incorrect object, it will reset itself. Otherwise we can brute force. Until it gets it right.
JASON LENGSTORF: For this demo, given that we are somewhat haphazard with the order of things and we're getting multiple objects back and stuff, I'm a little worried that the reset is going to get us in trouble. I think in this case we should allow brute forcing.
JASON MAYES: Okay. Make it more secure.
JASON LENGSTORF: Yes. If you were going to take this further, you would want some kind of a, like, brute force detection, right? Multiple wrong attempts would need to reset or something. The problem is, if I pick up a banana and put it down and detects person, it would reset the whole thing.
JASON MAYES: We can allow person. Maybe person is the -- we also need to detect -- because remember how fast it runs, 30 frames a second here. If you hold a banana for a split second, it's going to count three bananas instantaneously. We want to recognize an object that's not a person, and then we want to specify a state in the system to look for only a person. That means a person put the banana away, and then the person can present the next object. I think we need a state system going on, too, to make this work, since it's running so fast. If it ran once a second, a human could deal with it. Because it's running so fast, we need to have the state system in there, too. Let's see how it goes. We can get to that as we go along, I guess.
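One way to sketch the state system being proposed, not the code written on stream. All names here (`createPasswordChecker`, `update`, `matched`) are hypothetical. It follows the decisions in the conversation: wrong objects are ignored rather than resetting, and after each match the checker waits for a frame with nothing but a person in it. Treating an empty frame as "clear" too is an added assumption.

```javascript
// Hypothetical checker: feed it the class names seen in each frame.
function createPasswordChecker(sequence) {
  let nextIndex = 0; // which object in the sequence we expect next
  let waitingForClear = false; // true until the last object is put away

  return {
    // Returns true once every object has been shown in order.
    update(classes) {
      if (nextIndex >= sequence.length) return true;

      if (waitingForClear) {
        // "Only a person" (or nothing at all) means the object was put down.
        if (classes.every((name) => name === "person")) waitingForClear = false;
        return false;
      }

      if (classes.includes(sequence[nextIndex])) {
        nextIndex += 1;
        waitingForClear = true; // stops 30 frames of banana counting 30 times
      }
      return nextIndex >= sequence.length;
    },
    get matched() {
      return nextIndex;
    },
  };
}
```

Calling `update` once per frame from the detection loop means a banana held up for a second still counts as a single banana.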
JASON LENGSTORF: What I'm going to do for now is get an array of our objects here. So we've got a banana, we've got a book, and we'll do one more, which will be our fork.
JASON MAYES: Nice.
JASON LENGSTORF: Okay. So I'm going to -- nope, this one. This one in here. And so now we've got our objects. And so what we need to display out here is a, like, password. So we'll create a new div, and we'll call this password.
JASON MAYES: Cool, yep.
JASON LENGSTORF: I think the way we want this to work is we'll basically inject three images in here in whatever order.
JASON MAYES: Sure, yep, sounds good.
JASON LENGSTORF: So let's create a function.
JASON MAYES: Yeah, password generator, good stuff.
JASON LENGSTORF: Generate password. And this is going to -- let's see, what's a good object shuffler?
JASON MAYES: I would do Math.random for the array or something like that.
JASON LENGSTORF: But we need it to be, like -- because we can't use Math.random, because we'd have to track everything. Wondering if there's just a fast way.
JASON MAYES: Shuffle the array automatically. I see what you mean.
JASON LENGSTORF: Yeah, shuffle the array in place. I don't know, should we copy/paste? I think maybe we just copy/paste.
JASON MAYES: Using Math.random in there. That's interesting. Either way. It's in there somewhere. Just probably not using it correctly.
JASON LENGSTORF: Wait, how are they doing that? Wouldn't that -- I'm confused.
JASON MAYES: Let me have a quick look, as well.
JASON LENGSTORF: I'm somewhat confused by the choices here. If it starts with the length of the array, then wouldn't -- would it pop the array item off? Oh, it's swapping. It's swapping.
JASON MAYES: Yeah, swapping them around there. Yeah.
JASON LENGSTORF: I get it, that makes sense. Basically, doing a shuffle in place. I get what's happening. I'm going to copy/paste this directly from Stack Overflow.
JASON MAYES: Like a true coder.
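For reference, a swap-based in-place shuffle of the kind being described is usually the Fisher-Yates algorithm. This is a sketch of that algorithm, not necessarily the exact snippet that was pasted on the stream.

```javascript
// Fisher-Yates shuffle: walks from the end of the array and swaps each
// element with a randomly chosen element at or before its position.
function shuffle(array) {
  for (let i = array.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1)); // 0 <= j <= i
    [array[i], array[j]] = [array[j], array[i]];
  }
  return array; // same array, shuffled in place
}
```

Because the swap happens inside the original array, nothing has to be tracked separately, which is the concern raised about calling Math.random directly.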
JASON LENGSTORF: Just add a link. So we have a shuffle, and we'll start by -- it does return that array. So we're going to shuffle objects. And then we will do password.forEach. That's going to give us an object. And then for each object, we are going to -- we also need to get our container -- document.querySelector, that password div. Yes. Got our password. And then what we're going to do is -- let's see. img equals document.createElement('img'), and then we'll set the img.src to object.src and the img.alt, and container.appendChild(img). And we also need to empty -- there's a way to do this. What do you do?
JASON MAYES: Trying to reset the HTML?
JASON LENGSTORF: Just trying to empty it out.
JASON MAYES: .HTML, naughty. Yeah.
JASON LENGSTORF: That's fine. So let's do that. Let's just give it a shot. See what happens when we run it.
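A rough sketch of the rendering step dictated above, assuming each object carries a `src` and an `alt` and that the list has already been shuffled. The function name `renderPassword` and the optional `doc` parameter are hypothetical, and emptying the container through `innerHTML` is an assumption about what was typed.

```javascript
// Hypothetical helper: empties the container, then appends one <img> per
// password item in the given order.
function renderPassword(container, password, doc = document) {
  container.innerHTML = ''; // assumed way of clearing earlier images

  password.forEach((object) => {
    const img = doc.createElement('img');
    img.src = object.src;
    img.alt = object.alt; // the alt text doubles as the class name to detect
    container.appendChild(img);
  });

  return password; // returned so the detection code can check against it
}
```

In the browser this would be called with something like `renderPassword(document.querySelector('.password'), shuffledObjects)`, where the `.password` selector is also an assumption.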
JASON MAYES: Nice, yeah!
JASON LENGSTORF: Get a different order?
JASON LENGSTORF: Look at that. So let's make this look slightly easier to deal with. So we'll go with password, which will be display: flex, and we'll do -- what do we even -- actually, why don't we do this? Make it a grid: grid-template-columns: repeat(3, 1fr). Should give us a beautiful password. Okay, so now we'll be looking at a randomized password.
JASON MAYES: Nice.
JASON LENGSTORF: We also really don't need to show the web cam, but I think for funzies, we can.
JASON MAYES: Sure.
JASON LENGSTORF: The other thing we can do is say password -- or password img, opacity: 0.5.
JASON MAYES: Right, yeah, then we can set a class when it's been activated or something to make it show that we've got it right. Some nice visual feedback.
JASON LENGSTORF: Okay. And then we can transition that. Transition: Opacity. We'll go, like, 200 milliseconds and linear. Okay. So now we have, I think --
JASON MAYES: Yeah, the page we load -- there we go, perfect. Nice, nice.
JASON LENGSTORF: Okay. So what we don't have here is we are not currently tracking anything about the given password. So I think what we need is we should also, like, return our password, so that we can use it.
JASON MAYES: Yes, yep.
JASON LENGSTORF: Okay. And then --
JASON MAYES: That makes sense.
JASON LENGSTORF: Okay. So now that we have a password, what we want to do in here is when we're detecting objects, we want to be looking for the next -- like the next item.
JASON MAYES: We need a password index to know how far along we are, how far in the chain we are, and check the element each time to see if it's a match.
JASON LENGSTORF: Okay, yeah. So we can -- let's see, thinking through this logically. We have the current object would be zero to start, if I can spell. And then we would want to --
JASON MAYES: I think that's all we need.
JASON LENGSTORF: Yeah, so do I need to set this --
JASON MAYES: Every time we find the correct item in the loop above, we increment the counter and it will automatically go to the next thing we need to show. Right.
JASON LENGSTORF: Yeah, so then I need to put this up here, so when we're searching, we're going to say, if predictions --
JASON MAYES: Predictions, there could be more than one prediction, as we saw. So we do need to iterate the predictions, yeah. Let's look at that object again that comes back.
JASON LENGSTORF: So we're going to get an array of objects. So we're looking for -- we can just find for class of whatever the object, the first object is, and then we can check if the score is over, what, 70%, you think?
JASON MAYES: Let's go 66, two-thirds. Just on the safe side. That should be fine.
JASON LENGSTORF: Good, good, good. If we do predictions, let's see. predictions.find. And I'm just going to shorthand this. And we'll see password. I'm going to move this whole password thing up, so we can actually -- okay. Let's just move this whole thing up. This is definitely janky code, but we're going to deal with it. So if password[currentObject]. Right? password[currentObject].alt equals p.class. Is that it?
JASON MAYES: Class name. Is it class or class name? I've forgotten now. Sorry. It's class? Cool.
JASON LENGSTORF: So if we get a match. If match, then we will increment current object, and we will say --
JASON MAYES: Set the class of the one we just matched to be visible, as well.
JASON LENGSTORF: Yes, and the way we're going to do that is by saying document.querySelector. We'll just hack this. Alt equals match.class. Then we'll say classList, what did I call this? Found.
JASON MAYES: Cool.
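A small sketch of the matching step, assuming predictions have the shape the COCO-SSD model returns (objects with `class`, `score`, and `bbox`). The helper name `findMatch` is hypothetical, and folding the two-thirds score threshold into the same check is an assumption about how the discussion above would be implemented.

```javascript
const MIN_SCORE = 0.66; // the "two-thirds" threshold suggested above

// Hypothetical helper: returns the first prediction whose class equals the
// expected name and whose score clears the threshold, or undefined.
function findMatch(predictions, expectedClass, minScore = MIN_SCORE) {
  return predictions.find(
    (p) => p.class === expectedClass && p.score > minScore
  );
}
```

With a password item whose alt text is `banana`, `findMatch(predictions, 'banana')` only returns a prediction when the model is reasonably confident, which should cut down on accidental matches from a single noisy frame.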
JASON LENGSTORF: So, theoretically speaking, this should work. Let's wait until we get our thing here. Okay, there it is. When I show this a banana, we should see that banana fade in. Oh, my God, look at it go. Let's show a book. Book.
JASON MAYES: Okay.
JASON LENGSTORF: Show me a fork.
JASON MAYES: Look at that.
JASON LENGSTORF: Doesn't matter, it got there. Of course, it went out of bounds at the end, we could tidy that easily. That's fine. Cool, that's nice.
JASON LENGSTORF: I guess what we could do, too, we could say if current object is greater than --
JASON MAYES: Equals the length.
JASON LENGSTORF: That's right.
JASON MAYES: Say our congratulations message or something.
JASON LENGSTORF: So we can basically kill this loop once we have hit the thing, and then for now, we can do document.querySelector, current predictions, innerText, unlocked. Okay, so this is like the jankiest of jank, but I think this works. I think what we should do now -- banana.
JASON MAYES: Nice.
JASON LENGSTORF: Fork.
JASON MAYES: Yep, good, good.
JASON LENGSTORF: Book. Unlocked. And then the loop is done. We still got one wrong. Aha! Behold, my bucket! Oh, wait. If match, alt -- why is it giving me an alt issue?
JASON MAYES: What was the error? I didn't quite see the error there. My bad.
JASON LENGSTORF: Password, current object. So the error is that it's missing a -- when we get to the top, so we showed a book.
JASON MAYES: Yeah.
JASON LENGSTORF: Showed a banana. And we showed a fork. Then it tries one time to get an object that doesn't exist, which means that the alt is -- so this is the alt.
JASON MAYES: I see.
JASON LENGSTORF: Oh, oh, oh, oh.
JASON MAYES: How is it called again?
JASON LENGSTORF: I know what it is. We need to be checking up here. So if, then we will do this. And return. And down here, we'll just keep this as it was. Okay. So this should actually work. So, check to see -- here we go, corgi time. I still haven't added the sound back in. I apologize. We'll get a good parade going one of these days. What we're doing here, we have the model detecting, and once we have our predictions back, it checks to see if we've already gotten the whole password. If we've already gotten the whole password, then we set it unlocked and return and it bails on this whole function. We're done. If we're not there yet, we get a match by looking at whether or not one of the predictions matches the next object in the password. So the alt text matches the class names, which is why this is working. If we get a match, then we set the matching image's class to found, so that it fades in, and we move to the next item in the password. And then we start the whole loop over again. So what we were doing wrong before is we were starting, we were doing the increment, and then it was running the loop one last time.
JASON MAYES: But already incremented. So out of bounds.
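The fix described here comes down to ordering: check whether the password is complete before indexing into it. A minimal sketch under that reading, with the DOM updates left out so it runs anywhere. The names `createPasswordChecker` and `check` and the returned object shape are hypothetical.

```javascript
// Hypothetical checker: tracks how far through the password we are and
// never reads past the end of the array.
function createPasswordChecker(password, minScore = 0.66) {
  let currentObject = 0;

  return function check(predictions) {
    // Guard first: once every item has been matched, bail out early
    // instead of reading password[currentObject].alt out of bounds.
    if (currentObject >= password.length) {
      return { unlocked: true, matched: null };
    }

    const expected = password[currentObject].alt;
    const match = predictions.find(
      (p) => p.class === expected && p.score > minScore
    );

    if (match) {
      currentObject += 1; // move on to the next item in the password
    }

    return {
      unlocked: currentObject >= password.length,
      matched: match ? match.class : null,
    };
  };
}
```

On the stream the equivalent code also added the found class to the matching image and wrote the unlocked message into the page; here the caller would do that based on the returned `matched` and `unlocked` values.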
JASON LENGSTORF: So now this should, let's see, if we did this all right, what we'll do is get our password in.
JASON MAYES: Nice.
JASON LENGSTORF: Look at my fork, come on.
JASON MAYES: Fork is blending with the background.
JASON LENGSTORF: Got it, okay. And a book. There we go. Unlocked. No error. We have done it.
JASON MAYES: Wahoo!
JASON LENGSTORF: That's it, right, that's the thing. You all can go try this right now. Beautiful. I'm really excited. We're also, surprisingly, ahead of schedule here.
JASON MAYES: I got something I can show you, which could be good -- if we got -- how much time do we have left? What's timing at?
JASON LENGSTORF: We have about 20-ish, 25 minutes.
JASON MAYES: I'm going to throw out there, there's a really easy way to take five of those minutes off to recognize any object in the world in five minutes.
JASON LENGSTORF: Let's do it. Yeah, let's do it.
JASON MAYES: If you go to your favorite search engine and search for Teachable Machine, you should find a website called Teachable Machine. I believe it's teachablemachine.withgoogle.com. The only difference is we're using TensorFlow.js for its true nature, to create machine learning models using data we're going to feed in live from the web browser. What's cool is all the training data we're going to give it is never sent to a server, so your privacy is preserved, which is nice. So none of this data will ever be sent anywhere. Click on Get Started, the blue button there. Currently, you can see that it supports -- whoop, three things at the moment. Hopefully, this will grow over time. It can detect objects, audio, or poses. Just to iterate on the object stuff, let's go to images for now. Image project. You get this really nice interface. Basically, on the left-hand side, this is the 101 of machine learning. You have a number of classes you want to learn how to recognize, essentially. So we need some training data in each class. I don't know what you've got over there. If you have interesting objects that you're pretty sure it's not going to recognize, feel free to grab them. Maybe your face versus something else. We can recognize you versus some other thing maybe. For the password, if you like, take a pencil and give the class a meaningful name. Whatever you have around.
JASON LENGSTORF: I have two watches. Is that a good one maybe?
JASON MAYES: Possibly different. Are they the same?
JASON LENGSTORF: They are different watches, but wristwatches.
JASON MAYES: As long as they look visually different, they should be able to detect the difference. This is very risky. Do it, let's try this. Class 2, whatever you want to call the other watch. So click on web cam for the first watch. Allow access to the web cam in the pop-up. Now we have a live preview of the web cam data coming in. Hold your watch up to the cam nice and close. Not too close, but so it's visible and you can see the detail. What I want you to do is in a minute, once I finish talking, click on the hold to record button but move your hand so it gets a variety of angles on the watches. Move it around. See the samples pop up on the right. So hold the button and move it around. Perfect. Plenty. Loads of images. That's good. More images you use, more time it takes to train, obviously. Just FYI. Go to the other watch and do the same down there and try and get the same number of images. So 80, get roughly the same. That's important.
JASON LENGSTORF: All right perfect.
JASON MAYES: That's close enough. Now click on train model in the middle there. And it's going to do something called transfer learning. And what that is, we have COCO-SSD, which already knows how to understand various objects in the world, like 90 of them, and it's learned something about the world by learning those 90 objects. It knows how to detect edges, shapes, so on and so forth, so it can use that knowledge to try and recognize other things. Obviously, it thinks you're watch number one. This is another thing to point out, watch one. Now bring in the second watch. And you have to bring it to the same kind of closeness. Watch 2 is now predicted. It does indeed detect a difference. We did that in, what, 30 seconds? You can click on the export model at the top, and we can then use that on any website we wish to unlock your website with your watches instead.
JASON LENGSTORF: Oh, that's cool. That's super cool. I like that. Yeah, I love the -- I love the approachability of that. Because the thing that I also really like about this is one of the things that we talk about in training data is always bias. So with something like this, you can bring your own data set, and you're able to do, you know, you can help ensure that whatever the thing you're trying to do, you train a model that way. Obviously, the real data set, I'm going to take a wild guess and you can correct me. In the object recognition model, I'm guessing that we're talking tens of thousands of images that got trained.
JASON MAYES: Each object. Just imagine for a second we wanted to recognize cats. In order for the machine learning to best understand what cat pixels really are, we need to understand kittens versus adults, examples of both of those. Different fur colors, different fur patterns, taken at different angles, different lighting conditions, in the grass versus in your house. All that stuff matters. Otherwise, you'll end up with bias in the model. For your watches right now, for example, if we only showed these two watches and then showed a bright blue watch, it might think that's not a watch. We need the variations to kind of make it learn that there are differences in these things and color doesn't matter. More like the shape that's important here or something like that. So with enough training data, you can get to that level of accuracy, essentially, that we can detect and generalize better. Of course, Teachable Machine is great for prototyping. I've seen people use this to control their garage doors, so if it's open and dark outside, connect this and it can shut it automatically and that kind of stuff. So the possibilities are endless what you can do with this. For prototyping, this is great for things that you have at home, all that kind of stuff. If you want to make something that's production ready, you can do something called Cloud AutoML, and it will go away and churn for hours and give you a model to export to TensorFlow.js at the end.
JASON LENGSTORF: Is this the one you're talking about?
JASON MAYES: Not AutoML. Sorry, I'm talking nonsense. This is the one. If you scroll down, there's an image classification one somewhere there. And, essentially, you can see it talking about it. AutoML Vision. So you upload gigabytes of data in each one of those folders and it will churn for days or hours, however long it takes. It will produce a model that it thinks is the best. What's really cool about this is it will search different types of ML models. We were using COCO-SSD, ImageNet, but there are other things that exist, too. Google will try to see what gets the best performance on your data. Depending what type of data you have, some things might perform better than others. And it will try to tweak parameters you set and try all these combinations, so it makes it super easy to use, which is nice. These two things are great for beginners to get their feet wet, so to speak, with machine learning, and then you can peel back the onion layers, so to speak, and cry some more as you get into the mathematics. It took me, like, a couple of years to really understand what was going on below all of that high-level stuff. I actually made a deck called machine learning 101 or something like this. If you Google it, you'll find it. That's, basically, my two years of aha moments in one deck. Then it goes through everything that I needed to understand in order to appreciate what's going on behind the scenes.
JASON LENGSTORF: Where is --
JASON MAYES: It's not that hard. You just need to learn how to add and multiply. It's really easy. Without that, it was really scary. Look at the research papers, what's going on here? I don't have any understanding what they are talking about here. But if you explain it simply, it is possible to understand from a high level what's going on.
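As one illustration of the "add and multiply" point, the basic building block of a neural network is a weighted sum: multiply each input by a weight, add the results, and add a bias. This sketch is an assumption about what is meant here; the function name and the numbers are made up for the example and do not come from the deck.

```javascript
// Hypothetical single "neuron": multiplies each input by its weight,
// adds everything up, and adds a bias term.
function weightedSum(inputs, weights, bias) {
  let total = bias;
  for (let i = 0; i < inputs.length; i++) {
    total += inputs[i] * weights[i];
  }
  return total;
}

// 1 * 0.5 + 2 * -1 + 0.25
const example = weightedSum([1, 2], [0.5, -1], 0.25);
```

Training, at a very high level, is the process of adjusting those weights and the bias until the sums come out close to the answers in the training data.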
JASON LENGSTORF: Where would we find that deck?
JASON MAYES: Put my name and machine learning 101 deck, something like this. That should pop up. Should be a link to the presentation itself. Google slides. That might be the one. There, that's the one. Yeah. This is 101 pages of joy. So, basically, good bedtime reading. I've tried to make it a bit fun. There's videos, nice animations. And it will walk you through from completely zero knowledge of machine learning, to having an understanding that this is not magic at least, and you know what the limitations are, and what it's good for, what it's bad for, that kind of stuff. And with that, you can take your first step to then go deeper, hopefully, at least.
JASON LENGSTORF: Nice. Well, this is very cool. I mean, I feel like this is -- I'm not going to lie, if any episode was going to go short, I didn't think it was going to be the one on machine learning. I thought we were definitely going to have, like, a lot to cover. So I actually think that is -- I feel like that's actually more and more true of some of the things that I would have considered to be too hard. If you had asked me to build a machine learning driven password protection for a section of a site, obviously, this is not production ready. We're not going to take this and throw this out into the world and be like, yeah, it's definitely secure. However, the fact that we were able to do this, this quickly, I think is a good signal that this stuff is more approachable than we may have thought. If we want to get into machine learning, start playing with this stuff. Like you said, it's not advanced math. We're not doing, you know, we're not getting into advanced computer science here anymore. We're now able to do, effectively, web dev. And if we need to --
JASON MAYES: Exactly, developers tinkering with a black box. 99% of the time, you don't need to make your own models from scratch. There will be cases where you do, and maybe you're making a new thing that's never been done before, but some of the kind of things like, you know, sound recognition, or object detection, these kinds of things are almost, dare I say, solved problems in a way and you can reuse an existing model, train it on your own data, and then just use that. And you don't have to go and build it yourself from the low-level Legos, as I was talking about earlier.
JASON LENGSTORF: Absolutely.
JASON LENGSTORF: Muffin or chihuahua. I bring this up, the chat was saying, oh, you should use foxes or cats or dogs. It reminded me of this. One thing that's probably worth talking about in the last couple of minutes that we have here is the limitations of this. Right, because it's not -- this isn't something that's just going to work. It's not human intelligence. It's trained data.
JASON MAYES: Mathematics. We must remember, it does not have any intelligence like you and I would have in our human brains. So, you know, show it a muffin, and it would call it a muffin for sure. But by having that training data available, you can then reduce these biases that might be learned by the models. So if we have this chihuahua dog and muffins as two separate classes, it would be able to figure out the differences, why that's a muffin versus a chihuahua.
JASON LENGSTORF: I'm, obviously, not going to lie, if you gave me this list, I'm pretty sure I would misclassify a few of these dogs as muffins.
JASON MAYES: Should I eat it?
JASON LENGSTORF: Yeah, we'll end up petting a bunch of muffins. It's all going to go poorly.
JASON MAYES: Got to be very careful there.
JASON LENGSTORF: Yeah. But I do think that's worth stating again. Machine learning is not intelligence in the way that we would think of it as applied to humans. Machine learning is a mathematical model that is measuring, like, the distance between pixels or something. It's not necessarily like I know what a dog is. It's, I know what images of dogs mathematically look like. And that, once that starts to click, I feel like that makes all of this a little less mysterious.
JASON MAYES: Yes, yes, more like statistics. I've seen 1,000 images of dogs, and on average, a dog has some of these features that I now know how to detect. And if I detect the same features, same proportions and ratios and that kind of stuff, probably it's a dog. That kind of stuff. So there are, like, research papers where you can even attack machine learning models by changing one pixel, and you can then make the prediction turn into a panda, just by changing the color value to be something very specific. You need to know how the architecture works, so it's not going to happen at random, or it's very unlikely, at least. But it's possible to fool the mathematics so it comes out with a different calculation. People working at a really low level can figure out how to do that kind of stuff. Yeah, once again, more training data helps alleviate these problems and so on and so forth.
JASON LENGSTORF: Excellent.
JASON MAYES: This is an ongoing research topic, obviously.
JASON LENGSTORF: Yeah. So this is amazing. Let's play this one more time. Let's play this game one more time. We are going to unlock a section of the website using our computer vision here. So what we've got is we have the object classification running right now, and it's looking at my web cam. So this is the web cam on my Mac. And it is currently seeing me as a person. I need to show it a banana. So I show it a banana. It sees a banana, we see that light up there. Now it wants to see a fork. Let's show it a fork. Okay. Sees a fork. Now I need to show it a book. Show it a book. There we go. I have now unlocked this section of my website. And to do all of that, all we had to do was write less than 100 lines of code. Which is fascinating to me. It's just amazing how approachable that is.
JASON MAYES: Right? And most of it isn't the machine learning.
JASON LENGSTORF: Yeah, yeah, that's fair. Yeah, it's mostly the logic, right, the logic of doing the password part, not the machine classification stuff.
JASON MAYES: Which is two lines to import. And use COCO-SSD. Yeah, really cool.
JASON LENGSTORF: I just realized I posted a bad link. Let me post a better one. Nope, nope, Twitch just really does not like those links. That's fine. You can copy/paste that out, you wouldn't be able to click it. These will all be in the show notes. For someone that wants to take this further. Where should someone go from here?
JASON LENGSTORF: Endless muffins!
JASON MAYES: And also, I managed to negotiate a discount with Manning. If you use madewithtfjs, you can get 30% off.
JASON LENGSTORF: 30%?
JASON MAYES: 30%. It's the only book I know of right now that talks about TensorFlow.js like this. There's also courses on Coursera. A course there does touch on TensorFlow.js. I'm not sure if it's dedicated or part of a general TensorFlow itself, but that could be useful for some folk. But I'd recommend the book and the kind of website as a starting point. And then if you need more of a video-based tutorial, check out the Coursera stuff. Also Gant made a course for the low-level Tensor stuff. Why is it called TensorFlow? Tensors are functions added on for good measure. As I spoke about, these tensors contain numbers. This is the mathematics I talked about to do machine learning. So the tensors contain lots of numbers, and Gant talks about how to use and manipulate the tensors, essentially, with his course on the academy there. It's a great course, if you want to understand the lower level stuff and know how to manipulate tensors and other things. That's useful if you want to take the web cam data yourself and turn that into a tensor to use with a different model that isn't a pre-made one. Understanding how tensors work is very important.
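To make "tensors contain numbers" concrete, here is a plain JavaScript sketch that turns raw RGBA pixel data (the layout a canvas `ImageData` object uses) into a nested height by width by 3 array of values between 0 and 1. It deliberately does not use TensorFlow.js; the library's own `tf.browser.fromPixels` performs the image-to-tensor conversion for real. The function name `pixelsToNestedArray` is hypothetical.

```javascript
// Hypothetical helper: converts flat RGBA bytes into a nested
// [height][width][3] array of normalized red, green, and blue values.
function pixelsToNestedArray(rgba, width, height) {
  const rows = [];
  for (let y = 0; y < height; y++) {
    const row = [];
    for (let x = 0; x < width; x++) {
      const i = (y * width + x) * 4; // 4 bytes per pixel: R, G, B, A
      row.push([rgba[i] / 255, rgba[i + 1] / 255, rgba[i + 2] / 255]);
    }
    rows.push(row);
  }
  return rows;
}
```

A single webcam frame handled this way is just a large block of numbers with a shape, which is the kind of input an image model does its arithmetic on.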
JASON LENGSTORF: Very cool. Also, I think this is -- this is a really, really fun space to be in. Looks like there's a lot of amazing stuff we can do. And what we did today is just kind of the very, very tip of the iceberg.
JASON MAYES: Totally, yeah.
JASON LENGSTORF: It looks like there is a huge amount of additional content, if you want to move on and go forward. Make sure you go and follow Jason on Twitter, @jason_mayes. Anything else you wanted to add before we wrap up?
JASON MAYES: One quick thing at the end. Please go and try stuff out. Try the Teachable Machine. And if you end up making something cool with that, unlock your garage door or whatever you want to do, use the hash tag so we can find you on social media, Twitter, LinkedIn, whatever your preference is. You have a chance to be featured on our future show and tells or a blog post to get you more visibility. If it's particularly cool, we'll reach out to you. Yeah, keep in touch, and look forward to hearing from you.
JASON MAYES: I'll look into that. It was working last week. Not sure what's happening.
JASON LENGSTORF: I also might have typoed it.
JASON MAYES: Should be just madewithTFJS, one word.
JASON LENGSTORF: All upper case?
JASON MAYES: Lower case is fine, I think.
JASON LENGSTORF: Okay, madewithtfjs. Okay, with that, chat, thank you so much for hanging out today. Stay tuned. We are going to raid. Jason, thank you so much for hanging out, teaching us today. This was super fun. I'm really excited to see what else we can do with this machine learning stuff. And on that note, we'll see you next time. Thanks, y'all.
JASON MAYES: Thank you very much for having me. Cheers.