Microsoft unveils AI model that understands image content, solves visual puzzles

March 1, 2023

An AI-generated image of an electronic brain with an eyeball. (Image: Ars Technica)

On Monday, researchers from Microsoft introduced Kosmos-1, a multimodal model that can reportedly analyze images for content, solve visual puzzles, perform visual text recognition, pass visual IQ tests, and understand natural language instructions. The researchers believe multimodal AI—which integrates different modes of input such as text, audio, images, and video—is a key step to building artificial general intelligence (AGI) that can perform general tasks at the level of a human.

“Being a basic part of intelligence, multimodal perception is a necessity to achieve artificial general intelligence, in

→ Continue reading at Ars Technica

