

Alibaba Cloud, the cloud computing arm of China’s Alibaba Group Ltd., today announced the open-source release of more than 100 new artificial intelligence large language models as part of its Qwen 2.5 family.

Revealed at the company’s Apsara Conference, the new series follows last year’s release of its foundation model, Tongyi Qianwen, or Qwen. Since then, the Qwen models have been downloaded more than 40 million times across platforms such as Hugging Face and ModelScope.

The new models range in size from as small as half a billion parameters to as large as 72 billion parameters. Parameters are the values an LLM learns during training that define its behavior and that it uses to make predictions, shaping its capabilities in areas such as mathematics, coding or expert knowledge.
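To make the parameter counts concrete, here is a minimal sketch that loads one of the smaller checkpoints with the Hugging Face transformers library and counts its parameters. The checkpoint name Qwen/Qwen2.5-0.5B-Instruct and the use of transformers/PyTorch are assumptions for illustration, not details taken from the article.

```python
# Minimal sketch: load a small Qwen 2.5 checkpoint and count its parameters.
# Assumes the Hugging Face `transformers` library and PyTorch are installed,
# and that the Qwen/Qwen2.5-0.5B-Instruct checkpoint name is available.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-0.5B-Instruct"  # one of the smaller Qwen 2.5 variants

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Count the learned parameters that define the model's behavior.
num_params = sum(p.numel() for p in model.parameters())
print(f"{model_name} has roughly {num_params / 1e9:.2f} billion parameters")

# Quick generation check with the loaded model.
inputs = tokenizer("What is 17 * 24?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```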

Smaller, more lightweight models can be trained quickly, on far less processing power and more focused training sets, and excel at simpler tasks. Larger models, in contrast, need heavy processing power and longer training times, but generally perform better on complex tasks requiring deep language understanding.
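A rough back-of-envelope calculation shows why size alone drives hardware requirements. The sketch below assumes 16-bit (2-byte) weights and counts only the memory needed to hold the weights themselves; the numbers are illustrative approximations, not vendor specifications.

```python
# Back-of-envelope sketch of why parameter count drives hardware needs.
# Assumes 16-bit (2-byte) weights and ignores activation/KV-cache memory.
BYTES_PER_PARAM_FP16 = 2

def weight_memory_gb(num_params: float) -> float:
    """Approximate memory needed just to hold the model weights, in gigabytes."""
    return num_params * BYTES_PER_PARAM_FP16 / 1e9

for name, params in [("Qwen2.5-0.5B", 0.5e9), ("Qwen2.5-72B", 72e9)]:
    print(f"{name}: ~{weight_memory_gb(params):.0f} GB of weights at 16-bit precision")

# Qwen2.5-0.5B: ~1 GB   -> fits on a laptop GPU or even in CPU memory
# Qwen2.5-72B:  ~144 GB -> needs multiple data-center GPUs just to load
```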
