
Unlocking creativity: How generative AI and Amazon SageMaker help businesses create ad creatives for marketing campaigns with AWS | Amazon Web Services


Advertising agencies can use generative AI and text-to-image foundation models to create innovative ad creatives and content. In this post, we demonstrate how you can generate new images from existing base images using Amazon SageMaker, a fully managed service to build, train, and deploy ML models at scale. With this solution, businesses large and small can develop new custom ad creative content faster and at lower cost than ever before.

Solution overview

Consider the following scenario: a global automotive company needs new marketing material for the release of its new car design, and it hires a creative agency known for providing advertising solutions to clients with strong brand equity. The car manufacturer is looking for low-cost ad creatives that display the model in diverse locations, colors, views, and perspectives while maintaining the brand identity of the car manufacturer. With the power of state-of-the-art techniques, the creative agency can support their customer by using generative AI models within their secure AWS environment.

The solution is developed with generative AI and text-to-image models in Amazon SageMaker. SageMaker is a fully managed machine learning (ML) service that makes it straightforward to build, train, and deploy ML models for any use case with fully managed infrastructure, tools, and workflows. Stable Diffusion is a text-to-image foundation model from Stability AI that powers the image generation process. Diffusers are pre-trained models that use Stable Diffusion to generate new images from an existing image based on a prompt. Combining Stable Diffusion with Diffusers like ControlNet lets you take existing brand-specific content and develop stunning new versions of it. Developing the solution within AWS with Amazon SageMaker provides the added benefits of a secure environment and fully managed infrastructure for building, training, and deploying the models.

For this post, we use the following GitHub sample, which uses Amazon SageMaker Studio with foundation models (Stable Diffusion), prompts, computer vision techniques, and a SageMaker endpoint to generate new images from existing images. The following diagram illustrates the solution architecture.

The workflow includes the following steps:

  1. We store the existing content (images, brand styles, and so on) securely in S3 buckets.
  2. Within SageMaker Studio notebooks, the original image data is transformed using computer vision techniques that preserve the shape of the product (the car model), remove color and background, and generate monotone intermediate images (see the sketch after this list).
  3. The intermediate image acts as a control image for Stable Diffusion with ControlNet.
  4. We deploy a SageMaker endpoint with the Stable Diffusion text-to-image foundation model from SageMaker JumpStart and ControlNet on a preferred GPU-based instance size.
  5. Prompts describing new backgrounds and car colors along with the intermediate monotone image are used to invoke the SageMaker endpoint, yielding new images.
  6. New images are stored in S3 buckets as they’re generated.
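
The following is a minimal sketch of step 2 using the Canny edge technique, assuming a local copy of the base image named sportscar.jpeg; the thresholds are illustrative and typically need tuning per image.

import cv2
import numpy as np

# Load the original product image (local copy of the object stored in S3)
image = cv2.imread("sportscar.jpeg")

# Extract edges to produce a monotone intermediate image that preserves
# the car's shape while discarding color and background detail
edges = cv2.Canny(image, 100, 200)

# ControlNet expects a 3-channel image, so replicate the edge map across channels
control_image = np.stack([edges] * 3, axis=-1)

cv2.imwrite("sportscar_canny.png", control_image)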

Deploy ControlNet on SageMaker endpoints

To deploy the model to SageMaker endpoints, we must create a compressed file for each individual technique model artifact along with the Stable Diffusion weights, inference script, and NVIDIA Triton config file.

In the following code, we download the model weights for the different ControlNet techniques and Stable Diffusion 1.5 to a local directory, which are later packaged as tar.gz files:

from huggingface_hub import snapshot_download

if ids == "runwayml/stable-diffusion-v1-5":
    # Download Stable Diffusion 1.5 weights, skipping files not needed for inference
    snapshot_download(
        ids,
        local_dir=str(model_tar_dir),
        local_dir_use_symlinks=False,
        ignore_patterns=unwanted_files_sd,
    )
elif ids == "lllyasviel/sd-controlnet-canny":
    # Download the ControlNet (Canny) weights
    snapshot_download(
        ids,
        local_dir=str(model_tar_dir),
        local_dir_use_symlinks=False,
    )
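
The downloaded artifacts are then packaged and uploaded to Amazon S3 so the endpoint can load them. The following is a minimal sketch, assuming model_tar_dir also contains the inference.py script and Triton config file; the archive name and S3 prefix are placeholders.

import tarfile
import sagemaker

# Compress the model directory (weights, inference.py, Triton config) into model.tar.gz
with tarfile.open("model.tar.gz", "w:gz") as tar:
    tar.add(str(model_tar_dir), arcname=".")

# Upload the archive to S3 for the SageMaker endpoint to load
sess = sagemaker.Session()
model_s3_uri = sess.upload_data("model.tar.gz", bucket=sess.default_bucket(), key_prefix="controlnet-sd")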

To create the model pipeline, we define an inference.py script that SageMaker real-time endpoints will use to load and host the Stable Diffusion and ControlNet tar.gz files. The following is a snippet from inference.py that shows how the models are loaded and how the Canny technique is called:

import cv2
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

# Load the ControlNet weights for the selected technique (for example, Canny)
controlnet = ControlNetModel.from_pretrained(
    f"{model_dir}/{control_net}",
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
)

# Combine ControlNet with the Stable Diffusion 1.5 weights in a single pipeline
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    f"{model_dir}/sd-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
)

# Define technique function for Canny: extract edges from the input image
image = cv2.Canny(image, low_threshold, high_threshold)
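
Within inference.py, the Canny edge map and the prompts from the request are then passed to the pipeline to generate an image. The following is a minimal sketch of that step, assuming the variables defined above (pipe and image) plus request fields such as positive_prompt, negative_prompt, steps, scale, and seed; the exact handler code in the repo may differ.

from PIL import Image
import numpy as np

# Convert the single-channel Canny edge map into a 3-channel control image
control_image = Image.fromarray(np.stack([image] * 3, axis=-1))

# Run Stable Diffusion conditioned on the control image and the prompts
generator = torch.Generator(device="cuda").manual_seed(seed)
output = pipe(
    prompt=positive_prompt,
    negative_prompt=negative_prompt,
    image=control_image,
    num_inference_steps=steps,
    controlnet_conditioning_scale=scale,
    generator=generator,
)
generated_image = output.images[0]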

We deploy the SageMaker endpoint with the required instance size (GPU type) from the model URI:

from sagemaker.huggingface import HuggingFaceModel

huggingface_model = HuggingFaceModel(
    model_data=model_s3_uri,  # path to your packaged model artifacts in S3
    role=role,                # IAM role with permissions to create an endpoint
    py_version="py39",        # Python version of the DLC
    image_uri=image_uri,
)

# Deploy model as a SageMaker endpoint
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.p3.2xlarge",
)

Generate new images

Now that the model is deployed to a SageMaker endpoint, we can pass in our prompts and the original image we want to use as our baseline.

To define the prompt, we create a positive prompt, p_p, for what we’re looking for in the new image, and the negative prompt, n_p, for what is to be avoided:

p_p = "metal orange colored car, complete car, colour photo, outdoors in a pleasant landscape, realistic, high quality"
n_p = "cropped, out of frame, worst quality, low quality, jpeg artifacts, ugly, blurry, bad anatomy, bad proportions"

Finally, we invoke our endpoint with the prompt and source image to generate our new image:

request = {
    "prompt": p_p,
    "negative_prompt": n_p,
    "image_uri": "s3://<bucket>/sportscar.jpeg",  # existing content
    "scale": 0.5,
    "steps": 20,
    "low_threshold": 100,
    "high_threshold": 200,
    "seed": 123,
    "output": "output",
}
response = predictor.predict(request)
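
The generated image can then be decoded and written back to Amazon S3, completing step 6 of the workflow. The following is a minimal sketch, assuming the endpoint returns the image as a base64-encoded string under the "output" key; the actual response format depends on the inference script, and the bucket and key names are placeholders.

import base64
import boto3

# Decode the base64-encoded image returned by the endpoint (assumed response format)
image_bytes = base64.b64decode(response["output"])

# Store the new ad creative in S3 alongside the existing content
s3 = boto3.client("s3")
s3.put_object(Bucket="<bucket>", Key="generated/sportscar_orange.png", Body=image_bytes)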

Different ControlNet techniques

In this section, we compare the different ControlNet techniques and their effect on the resulting image. We use the following original image to generate new content using Stable Diffusion with ControlNet in Amazon SageMaker.

The following table shows how each technique's output dictates which aspects of the original image the generation focuses on.

Technique Name | Technique Output | Prompt
canny | A monochrome image with white edges on a black background. | metal orange colored car, complete car, colour photo, outdoors in a pleasant landscape, realistic, high quality
depth | A grayscale image with black representing deep areas and white representing shallow areas. | metal red colored car, complete car, colour photo, outdoors in pleasant landscape on beach, realistic, high quality
hed | A monochrome image with white soft edges on a black background. | metal white colored car, complete car, colour photo, in a city, at night, realistic, high quality
scribble | A hand-drawn monochrome image with white outlines on a black background. | metal blue colored car, similar to original car, complete car, colour photo, outdoors, breath-taking view, realistic, high quality, different viewpoint
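
As a rough illustration of how such control images can be produced, the controlnet_aux package provides detectors for several of these techniques. The following is a minimal sketch using its pretrained annotators; it assumes controlnet_aux is installed and is not part of the endpoint code in the repo.

from PIL import Image
from controlnet_aux import HEDdetector, MidasDetector

original = Image.open("sportscar.jpeg")

# HED (soft-edge) control image: white soft edges on a black background
hed = HEDdetector.from_pretrained("lllyasviel/Annotators")
hed_image = hed(original)

# Depth control image: grayscale depth estimate of the scene
midas = MidasDetector.from_pretrained("lllyasviel/Annotators")
depth_image = midas(original)

hed_image.save("sportscar_hed.png")
depth_image.save("sportscar_depth.png")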

Clean up

After you generate new ad creatives with generative AI, clean up any resources that won't be used. Delete the data in Amazon S3 and stop any SageMaker Studio notebook instances so you don't incur further charges. If you used SageMaker JumpStart to deploy Stable Diffusion as a SageMaker real-time endpoint, delete the endpoint either through the SageMaker console or SageMaker Studio.
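
If you deployed the endpoint with the SageMaker Python SDK as shown earlier, a minimal cleanup sketch using the predictor object looks like the following; deleting through the SageMaker console works as well.

# Delete the real-time endpoint and the associated model to stop incurring charges
predictor.delete_endpoint()
predictor.delete_model()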

Conclusion

In this post, we used foundation models on SageMaker to create new content images from existing images stored in Amazon S3. With these techniques, marketing, advertisement, and other creative agencies can use generative AI tools to augment their ad creatives process. To dive deeper into the solution and code shown in this demo, check out the GitHub repo.

Also, refer to Amazon Bedrock for use cases on generative AI, foundation models, and text-to-image models.


About the Authors

Sovik Kumar Nath is an AI/ML Solutions Architect with AWS. He has extensive experience designing end-to-end machine learning and business analytics solutions in finance, operations, marketing, healthcare, supply chain management, and IoT. Sovik has published articles and holds a patent in ML model monitoring. He has double master's degrees from the University of South Florida and the University of Fribourg, Switzerland, and a bachelor's degree from the Indian Institute of Technology, Kharagpur. Outside of work, Sovik enjoys traveling, taking ferry rides, and watching movies.

Sandeep Verma is a Sr. Prototyping Architect with AWS. He enjoys diving deep into customer challenges and building prototypes for customers to accelerate innovation. He has a background in AI/ML, as a founder of New Knowledge, and is generally passionate about technology. In his free time, he loves traveling and skiing with his family.

Uchenna Egbe is an Associate Solutions Architect at AWS. He spends his free time researching herbs, teas, superfoods, and how to incorporate them into his daily diet.

Mani Khanuja is an Artificial Intelligence and Machine Learning Specialist SA at Amazon Web Services (AWS). She helps customers use machine learning to solve their business challenges with AWS. She spends most of her time diving deep and teaching customers on AI/ML projects related to computer vision, natural language processing, forecasting, ML at the edge, and more. She is passionate about ML at the edge and has built her own lab with a self-driving kit and a prototype manufacturing production line, where she spends a lot of her free time.
