Date: 2024-12-03

Amazon has a new group of foundation models, dubbed ‘Amazon Nova’, that cover a range of multimodal interactions, CEO Andy Jassy has revealed.

Speaking at AWS re:Invent 2024 in a surprise appearance during the opening keynote, Jassy said that Amazon, after working hard on its own homegrown frontier models, was ready to release Nova.

“I'm excited to share and announce the launch of Amazon Nova, which are our new state of the art foundation models,” Jassy said to an enthusiastic response from the crowd.

Available in Amazon Bedrock, Amazon Nova comes in a variety of shapes and sizes, including four different pricing tiers and two models that focus specifically on image and video generation.

Amazon Nova Micro is the most basic option, a text-only model delivering the lowest latency responses at a low cost. Amazon Nova Lite is the next stage up, a multimodal model that also comes at a low cost and processes image, video, and text inputs.

Nova Pro is more capable still, while Amazon Nova Premier offers customers the highest level of functionality and can be used for model distillation. Distillation, part of a separate announcement at AWS re:Invent 2024, will allow users to turn large language models (LLMs) into smaller ones for specific use cases.

Meanwhile, Amazon Nova Canvas and Amazon Nova Reel generate images and videos, respectively.

Jassy walked through some of the benchmarking tests and said that, by Amazon's own estimations, Nova is equal to or better than several leading models in relevant size categories - including Llama and Gemini - on certain benchmarks.

“I'll just say that we used external, published benchmarks whenever we could. When they weren't available, we did it ourselves. We published methodology on our website, so you can try and replicate it if you like,” Jassy said.

For example, Jassy said the Nova Lite model is equal to or better than OpenAI’s GPT-4o Mini on 17 out of 19 benchmarks, and equal to or better than Gemini on 17 out of 21 benchmarks.

Jassy described the models as both compelling and competitive, and also spoke about how cost-effective they are, noting that they’re around 75% less expensive than the other leading models in Bedrock.

The announcements didn’t stop there, with Jassy promising a speech-to-speech model in the near future as well as an ‘any-to-any’ model - essentially a model offering full multimodal interactivity.

“You'll be able to input text, speech, images, video, and output text, speech, images, and video,” Jassy said, referring to the ‘any-to-any’ model.

“This is the future of how frontier models can be built and consumed,” Jassy said.