Conf42 Python 2024 - Online

Personalizing Your Images with AI-Powered Features

Abstract

Use AI for bulk image personalization, including smart crop and resize, background customization, object detection & captioning. For instance, dynamically determine orientation (profile or landscape) based on content, add sunglasses to faces & overlay captions matching the image’s predominant color.

Summary

  • Cloudinary's Sharon Yelenik talks about personalizing your images with AI-powered features. AI can help you manage your images by analyzing each image and returning the information you need. It can also automatically moderate your images so that everything that you're showing on your website is appropriate.
  • AI can be used to handle our images better in our websites and apps. Let's take a look at a sample app, a little demo, to see how to put those concepts into practice.
  • How do we use AI to handle our images better on our websites and apps? The solution that I want to talk to you about today is Cloudinary. Cloudinary handles your images from end to end, meaning from uploading, analyzing, storing, managing, and editing, all the way to delivering your images and videos.
  • The upload preset enables me to upload all of the images from the client side without any signatures. The parameters that I'm passing are actually activating the analysis. On success I'm going to display the images that were uploaded and then the next button that will bring me to my output.

Transcript

This transcript was autogenerated. To make changes, submit a PR.
Hi, my name is Sharon Yelenik. I'm an API and DevEx content writer at Cloudinary, and today I'm going to talk to you about personalizing your images with AI-powered features. So first I want to touch on how you handle your images today. It could be that you handle your images with a graphic designer. If you only have a few images on your website, it's very likely that you have a graphic designer who looks through every image, crops it properly, adds the proper text, maybe localizes it per country that you service, maybe puts a logo on it to watermark your images. Basically, the graphic designer has their eye on every single image in order to prepare it for delivery.
The problem is that when you have a website with lots and lots of images, it becomes practically impossible to have a graphic designer or yourself go through every single image and personalize it individually. Also, if you have user-generated content on your website, you can't really go through every single image, because most likely it goes directly from user upload to being delivered on your website. If you have either of these scenarios, where you have a lot of images or you have user-generated content appearing on your website, you probably have to up your game and do something else besides going through every single image manually in order to prepare it for delivery. And most likely you're using some sort of AI to handle those images for you.
Now, AI can help you manage your images by analyzing each image and returning the information that you need in order to make programmatic decisions. So I'm going to go through a few types of information that you can get, which will really help you to handle all the images on your website programmatically and automatically.
So first of all, moderation. Especially if you're using user-generated content, you need some way to automatically moderate each image and make sure that there's nothing unlawful or inappropriate going onto your website. AI can do that for you. It can automatically moderate your images so that you can feel comfortable that everything that you're showing on your website is appropriate.
Object detection. AI can go through each image and figure out which objects appear in it. As you can see in this little example here, we've got a car, a lady, a jacket, and a bag, and each object has a bounding box. You can get the coordinates, so you know not only which objects are in the image, but also where those objects are located in the image.
Auto tagging is probably a twin of object detection, because once I've detected which objects are in the image, how am I going to access that information later? You can auto tag the image based on all of the objects that are there. Basically, auto tagging saves that information as tags on each of your images. So, for example, this image would be tagged with all of the objects that have been found in it.
Face detection. It might be very important for you to know which images have faces and which images don't, and you might handle images with faces and without faces differently. You can also get the number of faces in the image, and again you can get the same sort of coordinates and bounding box for each face that's detected. Now, I'm going to show you an app in a few minutes that demonstrates how we can receive all this analysis on our images and then use that information in our program.
And we're actually going to do something playful: we're going to detect which images have faces and put sunglasses on each of the faces that are found.
Another very useful piece of information that you can get is color detection. With AI, you can automatically receive information about what the predominant colors are in the picture and act accordingly. Maybe you want to group the pictures with similar colors together for aesthetics. Maybe you want to add a background of the predominant color or a border of that color.
And finally, OCR. AI can detect any text that's in the image and give that information to you so that you can save it and programmatically make decisions based on it later. And of course, there's lots more analysis that AI can do on your images. These are just a few examples.
So now that we've discussed a little bit about how to use AI to handle our images better in our websites and apps, let's take a look at a sample app, a little demo, and we're going to look at the code for it a little bit later to see how to put those concepts into practice. In this demo, we're going to request user-generated content, we're going to perform analysis on the uploaded images, and then we're going to display them according to the analysis that was made on upload. So let's take a look. We're going to upload the images, and the analysis is being done as we speak. And we're going to click show details to see the images displayed based on the analysis that was done.
So, first of all, we've got two categories: people and shoes. According to the object analysis that was done, the people are placed in the area for people, and the shoes are placed in the area for shoes. And if you take a look at the images in particular, you'll see that each image is transformed according to the analysis that was done on it. So all of the people are now wearing sunglasses. None of the shoes are wearing sunglasses.
Also, we've chosen the predominant color in each image, and we've put a border of that color around the image. We have information underneath each image that describes what we found out about it: if there are any words, it tells us what text was detected, the number of faces and the coordinates of those faces, the orientation of the picture, and the predominant color.
One other thing that I want you to pay attention to is that we changed these images from being whatever shape they were, portrait or landscape, and we made all the pictures square. Now, there's a risk in doing that. When you change the size and the aspect ratio of images, you run the risk of either chopping off an important part of the picture in the crop or distorting the image. Here we've used AI to do something else. We put the picture into the shape that is requested, in this case a square, and we've outpainted the picture with AI in order to make it fit into the requested area. So you see that in the original picture, we didn't have this content, and actually this picture within a picture was extended, and AI was used to fill out the picture in the best way possible so that we don't have any distortions or false image sizes. And here we see that it's the same with the shoes: the same information has been extracted, with the same formatting and the same color detection. And last but not least, please take note that the very relevant titles on each image are composed from the auto tagging that we added in our image analysis on upload.
So let's move on to talking about how. How do we use AI to handle our images better on our websites and apps? The solution that I want to talk to you about today is Cloudinary. And actually, Cloudinary is what was powering the demo app that you just saw. So what is Cloudinary?
Cloudinary provides you with a cloud to store all your images and videos and any other type of file that you want to store there. And Cloudinary handles your images from end to end, meaning from uploading, analyzing, storing, managing, and editing, all the way to delivering your images and videos for your websites and apps.
So how do we access Cloudinary? You want to make a free account: go to cloudinary.com, sign up for free, and follow the instructions there. When you get into your own account, you will see that you get this very nice UI. We're going to make sure that we're in the Programmable Media section, and that will give us access to our cloud name, our API key, our API secret, and the API environment variable, which provides the credentials that we need in order to access the account from the code that we're going to write. So please, if you want to follow along, copy the API environment variable for future use. We're going to get to that in just a few minutes.
In addition, I want to show you the nice UI that you get along with Cloudinary. It allows you to see all the images that were uploaded, and pretty much everything that you can do with your images and videos visually, you can also do programmatically. So here are all your images. Let's double-click and open a single image to manage it. You can see that every single image has a public ID, which is its unique identifier. You can also open the image in a browser window using its delivery URL, and this is the delivery URL that you can use to deliver the image on your website or app anytime and any place.
The last step that I need to go through before I can go into the app code is to install Cloudinary. So I'm going to install Cloudinary here in my terminal. If you don't have it installed yet, you'll need to do the same in order to use it in your own apps.
For this particular app demonstration, we're going to use the dotenv library in order to import my credentials from a .env file, so you can install that as well if you want to follow along with this particular app. So here we're going into my actual code. In my app, I decided to name the main file demo.py. Of course, you can name it whatever you want. The first thing that I do is import the environment variable from the .env file using the dotenv library. And if we take a look at the .env file, you see my environment variable that I copied from the dashboard a few minutes ago.
In my demo.py file, the first thing that I need to do is to create an upload preset. Now, this upload preset should only be created once. So if you're running this app exactly as it is here, you want to comment it out before you run the app a second time. Of course, you might want to run this separately in another script if you're going to do it in your apps on a regular basis.
Okay, so let's take a look at the upload preset. I'm going to call cloudinary.api.create_upload_preset, and in order to use this, I have to import cloudinary. Other libraries that I have to import from cloudinary are the uploader and the api libraries. Within the upload preset, I'm going to pass a bunch of parameters. First of all, the name. I'm going to give it a certain name; of course, you can name it whatever you want. I'm going to make it an unsigned preset. This enables me to upload all of the images from the client side without any signatures and without having to use my API key and secret. use_filename equals true means that whatever the name is of the image that I upload, that's also going to be its public ID within Cloudinary.
And I'm specifying the folder that it's going to be uploaded to within the UI, and I'm giving it a certain tag so that I know that each image that's uploaded with this preset has come from this computer vision demo. The remainder of the parameters that I'm passing are actually activating the analysis. colors equals true means that I'm going to track the predominant colors of the images that I upload with this preset. faces equals true means that I'm going to detect the faces, count them, and return the coordinates for each face. categorization equals google_tagging means that I'm going to detect objects in the images that are uploaded, with auto_tagging being 0.7. That means that any object that was detected with a confidence of at least 70% is going to be saved as a tag on that image. The ocr parameter activates my text detection, my letter and word detection, and moderation means that I'm activating the AWS Rekognition add-on to moderate my pictures and make sure that I'm not including any inappropriate or illegal content.
So let's take a look at how I apply the upload preset to the images that I upload. Let's go to the index.html file and take a look. I'm going to create an upload widget, and let me remind you what that upload widget was. In my demo, I clicked upload files, and there appeared this nice little UI that allowed me to browse my local files and upload the images of my choice. So the button with the ID upload widget calls the JavaScript in order to instantiate that upload widget. Now I need to add this JavaScript script, and then I need to instantiate my upload widget with a cloud name and the name of the upload preset that I just created. On success, I'm going to display the images that were uploaded, and then I'm going to display the next button that will bring me to my output.
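The upload preset described in this part of the talk can be sketched roughly as follows. This is a minimal sketch and not the demo's actual code: the preset name, folder, and tag are hypothetical placeholders, and the add-on values `adv_ocr` and `aws_rek` are my assumption about which OCR and moderation add-ons the talk refers to. The call itself is only made inside `create_preset`, which assumes the `cloudinary` package is installed and the API environment variable is set.

```python
# Options for a one-time upload preset. Names below are placeholders,
# not the values used in the talk's demo.
PRESET_OPTIONS = {
    "name": "computer_vision_demo_preset",  # hypothetical preset name
    "unsigned": True,                  # allow client-side uploads without a signature
    "use_filename": True,              # base the public ID on the uploaded file name
    "folder": "computer_vision_demo",  # hypothetical target folder
    "tags": "computer_vision_demo",    # hypothetical tag marking uploads from this demo
    "colors": True,                    # return predominant color information
    "faces": True,                     # return coordinates of detected faces
    "categorization": "google_tagging",  # object detection add-on
    "auto_tagging": 0.7,               # save detected objects with confidence >= 0.7 as tags
    "ocr": "adv_ocr",                  # assumed OCR add-on value
    "moderation": "aws_rek",           # assumed AWS Rekognition moderation value
}


def create_preset(options=PRESET_OPTIONS):
    """Create the preset. Run once; a second run with the same name will fail."""
    # Imported lazily so the options can be inspected without the SDK installed.
    import cloudinary.api

    return cloudinary.api.create_upload_preset(**options)
```

Keeping the options in a plain dictionary makes it easy to move the one-time creation step into a separate setup script, as the talk suggests.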
So when I click the show details button, I'm going to submit the form, and that's going to route me to the output function in my demo.py file. Let's take a look at what happens in the output function. First of all, I'm going to get a list of all the assets, all the images that I uploaded using the upload preset and tagged as computer vision demo. Next, I'm going to loop through every single one of those assets and get all of the details that I gathered using the analysis that I set in the upload preset. cloudinary.api.resource returns a JSON object which contains all the information that we gathered on analysis, and it also includes the image's delivery URL. Here you can see an example of what that might look like.
Now let's take a look at the parameters that we use in that method. I'm going to specify each image by its public ID. faces equals true means that I'm going to retrieve all the information that I got on the faces. colors equals true means that I'm retrieving all the information that I got on the predominant colors. The first thing I'm going to do is only display the image if the moderation status is approved, meaning that any image that wasn't deemed to be appropriate or legal is not going to appear on my website. The next thing that I'm going to do is save the delivery URL of the particular image in the variable url.
Now, I didn't discuss this earlier, but you should know that one of the hallmark Cloudinary functionalities is that I can make edits and transformations on the images by adding parameters to the delivery URL. Meaning, if I want to crop or add an effect or add an overlay, all I have to do is programmatically change the delivery URL, and voila, you'll see it affecting how the image is displayed. So we're going to look at that in a second. The next thing that I'm going to do is build the title.
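The moderation check described above might look something like this. It is a sketch under stated assumptions rather than the demo's code: the tag name is a placeholder, and the `moderation`, `status`, and `secure_url` field names reflect my understanding of the response shape, so check them against a real response. The sample dictionaries are made up for illustration.

```python
def is_approved(resource):
    """Return True only if the asset has moderation results and all are approved."""
    moderation = resource.get("moderation", [])
    return bool(moderation) and all(
        entry.get("status") == "approved" for entry in moderation
    )


def approved_urls(resources):
    """Keep the delivery URLs of approved assets only."""
    return [r["secure_url"] for r in resources if is_approved(r)]


def fetch_details(tag="computer_vision_demo"):
    """List assets by tag, then fetch face and color details for each one.

    Requires the cloudinary package and configured credentials.
    The tag name is a hypothetical placeholder.
    """
    import cloudinary.api

    listing = cloudinary.api.resources_by_tag(tag)
    return [
        cloudinary.api.resource(item["public_id"], faces=True, colors=True)
        for item in listing["resources"]
    ]


# Made-up sample data in the assumed response shape.
samples = [
    {"secure_url": "https://example.com/a.jpg",
     "moderation": [{"kind": "aws_rek", "status": "approved"}]},
    {"secure_url": "https://example.com/b.jpg",
     "moderation": [{"kind": "aws_rek", "status": "rejected"}]},
    {"secure_url": "https://example.com/c.jpg"},  # no moderation result at all
]
print(approved_urls(samples))
```

Treating an asset with no moderation result as not approved is a conservative choice on my part; the talk only says that unapproved images are not displayed.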
Now, how am I going to build the title? If you remember, I used object detection in my upload preset, and I auto tagged each image based on the objects that were detected. So I'm going to retrieve those tags, and that's how I'm going to build what I display in the title for each image.
Now, the message on the bottom, all the information that I displayed underneath each picture, I'm also going to build that. First of all, the text detection: if it detected some sort of phrase, it's going to display that in the message. If no text was detected, then it's going to tell me that too. Faces: if I don't detect any faces, I'm going to display the message that there aren't any faces, but if I do detect one or more faces, then I'm going to display that, and I'm also going to print out the coordinates of the detected faces. I'm also going to be able to output whether the image is in landscape or portrait orientation by dividing the width by the height of the image as returned by the analysis. I'm going to take the predominant colors and print those out.
And finally, and most excitingly, I'm going to take the delivery URL and create a transformation out of it. In the delivery URL, I'm going to say that I want to have a width of 600 and a height of 600, making it square. I'm going to set it to paint out the background. Remember what we said: I don't want to squash my image, I don't want to distort it, and I don't want to crop off any parts of it. What I want to do is fill in the parts that are left blank when I shrink it to fit into a square. And I'm going to fill that in with AI, so that the picture is automatically and beautifully painted out to fit the bounding box of a square.
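The title and the message underneath each picture could be assembled along these lines. This is a sketch, not the demo's code: the `tags`, `faces`, `width`, `height`, and `predominant` keys are my assumption about the analysis response, the detected text is passed in separately to avoid guessing at the OCR response path, and the wording of each message line is invented for illustration.

```python
def orientation(width, height):
    """Classify by dividing width by height; a square is reported as portrait here."""
    return "landscape" if width / height > 1 else "portrait"


def build_title(resource):
    """Build a title from the auto-generated tags."""
    return ", ".join(tag.capitalize() for tag in resource.get("tags", []))


def build_message(resource, detected_text=""):
    """Return the lines of information shown underneath an image."""
    lines = []
    lines.append(
        f"Text detected: {detected_text}" if detected_text else "No text detected"
    )

    faces = resource.get("faces", [])
    if faces:
        lines.append(f"Faces detected: {len(faces)}, coordinates: {faces}")
    else:
        lines.append("There aren't any faces")

    lines.append(f"Orientation: {orientation(resource['width'], resource['height'])}")

    # Assumed shape: {"google": [[color_name, percentage], ...]}
    predominant = resource.get("predominant", {}).get("google", [])
    if predominant:
        lines.append(f"Predominant color: {predominant[0][0]}")
    return lines


# Made-up sample in the assumed response shape.
sample = {
    "tags": ["person", "jacket"],
    "faces": [[120, 80, 60, 60]],
    "width": 800,
    "height": 1200,
    "predominant": {"google": [["brown", 48.2], ["white", 20.1]]},
}
print(build_title(sample))
for line in build_message(sample):
    print(line)
```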
This is where I overlay the sunglasses onto each face, and if I do say so myself, it did quite a good job of placing those sunglasses right on top of the eyes of every single face that was detected in the image. I'm going to create a border out of the predominant color, and the last thing that I'm going to do within the transformation is also a Cloudinary feature, which is optimizing each image. Quality equals auto and fetch format equals auto optimize your image so that the loading time for your website is reduced and your web performance is increased by a ton. What quality equals auto does is compress all of your images automatically, so that you get the best compression while keeping your images as clear and as visually high quality as possible. What fetch format equals auto does is deliver the image in the optimal format for the requesting browser. So each time, you automatically get the image delivered with the best format and compression possible for that image and for the requesting browser.
Finally, we've finished with our transformations and we've finished appending our messages. The last thing that we want to do is sort the images based on the tags that auto tagging saved for each image. We want to make sure that we take the people and put them in the people category, and take the shoes and put them in the shoes category. And of course, you can go on and on and build this out to whatever your needs are, whether you're displaying accessories or clothes or shirts or dresses or menswear or women's wear or children's wear, whatever you want. You can definitely use the AI and the auto tagging to display the images in the proper location. And of course, the outcome of this function is to render output.html, where all our images are delivered using the analysis and applying it to them appropriately, as we instructed in our program.
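Putting the transformation steps together, the edited delivery URL could be built roughly like this. Treat it as a sketch under assumptions: the overlay public ID `sunglasses`, the 10px border width, the relative overlay width, and the tag-to-category mapping are all hypothetical, and the exact parameters in the demo may differ. The URL parameters used (`c_pad` with `b_gen_fill`, `l_` with `fl_layer_apply` and `g_faces`, `bo_`, `f_auto`, `q_auto`) are my best understanding of Cloudinary's transformation URL syntax.

```python
def build_transformation(border_color, overlay_id="sunglasses", has_faces=True):
    """Chain the transformation steps described in the talk into one URL segment."""
    # Square 600x600 canvas, padded with AI-generated background instead of cropping.
    steps = ["b_gen_fill,c_pad,h_600,w_600"]
    if has_faces:
        # Overlay a (hypothetical) sunglasses image on every detected face,
        # sized relative to each face region.
        steps.append(
            f"l_{overlay_id}/c_scale,fl_region_relative,w_1.0/fl_layer_apply,g_faces"
        )
    # Border in the predominant color (assumed to be a named color such as "brown").
    steps.append(f"bo_10px_solid_{border_color}")
    # Automatic format and quality selection.
    steps.append("f_auto,q_auto")
    return "/".join(steps)


def transformed_url(delivery_url, transformation):
    """Insert the transformation segment right after /image/upload/."""
    marker = "/image/upload/"
    head, found, tail = delivery_url.partition(marker)
    if not found:
        raise ValueError("not an image upload delivery URL")
    return f"{head}{marker}{transformation}/{tail}"


# Hypothetical mapping from display category to the tags that select it.
CATEGORY_TAGS = {
    "people": {"person", "people"},
    "shoes": {"shoe", "shoes", "footwear"},
}


def categorize(tags):
    """Return the first category whose tag set overlaps the image's tags, else None."""
    lowered = {tag.lower() for tag in tags}
    for category, wanted in CATEGORY_TAGS.items():
        if lowered & wanted:
            return category
    return None


url = "https://res.cloudinary.com/demo/image/upload/v1/shoes.jpg"
print(transformed_url(url, build_transformation("brown", has_faces=False)))
```

Skipping the overlay step when no faces were detected mirrors the demo's result, where the people wear sunglasses and the shoes do not.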
So hopefully you've gained a little bit of insight and a little bit of inspiration from this app. You've gotten a taste of what AI can do for you and how it can help you handle your images and videos in your websites or apps when you're using a large volume of images or user-generated content. Feel free to take any of the features that we've shown here in our little demo and use them as appropriate for your use case and your website or app. This code is available on GitHub, and I've provided a link in the corresponding presentation.
I've also provided a few links where you can learn some more in our documentation, including seeing AI in action. For transformations, you've got a full listing of all the transformations that are available: all the parameters and effects that you can add, and all of the optimizations, transformations, and cropping methods that you can use just by adding parameters inside your delivery URL. Optimization makes sure that the compression for your images is optimal, meaning that they are the smallest size they can be without losing any visual quality, and that you're delivering your images in the optimal format for the requesting browser. There are also all of the options that you can set on your upload presets; we've seen just a few of them in this app. Image analysis covers all the features, all the ways in which you can ask AI to analyze your images. And of course there's the upload widget, which is a cute little UI for user-generated content on your e-commerce websites.
So I hope you enjoyed this presentation, and I hope you enjoy the rest of the talks at Conf42. I welcome you and encourage you to experiment with your new free account in Cloudinary, and let us know how you've used Cloudinary in our community and Discord channel.
Sharon Yelenik

API and Devex Content Writer @ Cloudinary



