AI Arabic Language Model “NooN”

Naseej launches its innovative AI Arabic language model “NooN” as an open-source initiative, with over 7 billion parameters.

Book a Demo Now

Fill in the form with your information and we will contact you as soon as we can.

Explore Limitless Potentials

About NooN AI

“NooN” is a 7-billion-parameter Arabic language model that enhances automated content creation and conversational AI capabilities. It is tailored to the unique nuances of the Arabic language, with an extensive vocabulary, an advanced understanding of Arabic grammar, and a deep comprehension of cultural contexts. NooN can fluently generate Arabic text, analyze sentiment, and provide accurate responses, unlocking opportunities for developers, innovators, and entrepreneurs across the Arabic-speaking world.

Noon is an open-source Arabic large language model released by Naseej to contribute to the wider Arabic-language AI ecosystem. It is designed to respond to various types of Arabic-language queries.

Noon has been released as an open-source model on the Hugging Face Hub.
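For developers, loading an open-source model from the Hugging Face Hub usually looks something like the sketch below. The repository ID `Naseej/noon-7b` and the prompt template are assumptions that are not stated on this page, so check the model card for the actual values. The sketch also assumes the `transformers` and `torch` packages are installed.

```python
# Assumed repository ID; it is not stated on this page. Check the model card.
MODEL_ID = "Naseej/noon-7b"


def build_prompt(instruction: str) -> str:
    """Wrap an instruction in a simple, hypothetical prompt template."""
    return f"Instruction: {instruction.strip()}\n\nResponse:"


def generate(instruction: str, max_new_tokens: int = 128) -> str:
    """Download the model weights (several GB) and generate a response."""
    # Imported lazily so build_prompt works without these packages installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(build_prompt(instruction), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("اكتب جملة ترحيبية قصيرة.")` would download the weights on first use, so it is best run on a machine with enough memory for a 7B model.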

The aim of this release is to serve the Arabic-speaking community, particularly researchers and developers, and the Arabic language as a whole.

NooN AI Key Features

  • Arabic Language Queries: Tailored to the Arabic language, with an extensive vocabulary, an advanced understanding of Arabic grammar, and a deep comprehension of cultural contexts.
    Functionality
  • Dolly Instructional Dataset: Dolly is an instruction-following large language model trained on the Databricks machine learning platform and licensed for commercial use.
    Functionality
  • ColossalAI: We trained the model using the ColossalAI framework, which fully supports HuggingFace library models and implements different optimization and quantization techniques for billion-scale LLMs.
    Framework
  • Alpaca Dataset: Cleaned to improve the performance of natural language processing models trained on this data. Removing errors and inconsistencies aims to improve the performance of the fine-tuned LLaMA models.
    Performance
  • 7B Parameters: Evaluated on over 4000 Arabic data samples. The evaluation prompting used clear and carefully crafted criteria aligned with the model's training objective and the rules of the Arabic language.
    Scalability
  • Open Source: The aim of this release is to serve the Arabic-speaking community, particularly researchers and developers, and the Arabic language as a whole.
    Usability
  • TruthfulQA: TruthfulQA is a benchmark that measures whether a language model is truthful in generating answers to questions. The benchmark comprises 817 questions that span 38 categories, including health, law, finance, and politics.
    Optimization
  • Multi-Generation Formats: Noon was trained with the main focus of having a model that responds to various types of instructions and questions (text generation, code generation, mathematical problems, closed/open-book questions, etc.).
    Functionality

Power in Numbers

7B+ parameters

110+ data records

11M+ words

4.07 out of 5 NooN AI score

Frequently Asked Questions

A language model is an artificial intelligence (AI) system designed to generate human-like text based on the patterns and structures it has learned from training on a vast amount of text data.

Language models use a statistical approach to predict the likelihood of a word or sequence of words appearing in a given context.

They learn patterns, grammar, and semantic relationships from the training data and generate text by estimating the most probable next word(s) based on the input context.
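The idea of estimating the most probable next word can be illustrated with a toy bigram model. This is only a minimal sketch on a made-up corpus; large models such as Noon use neural networks rather than raw counts, and all names below are illustrative.

```python
from collections import Counter, defaultdict


def train_bigram(text: str) -> dict[str, Counter]:
    """Count how often each word follows each other word."""
    words = text.split()
    follows: dict[str, Counter] = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def next_word_probs(follows: dict[str, Counter], context: str) -> dict[str, float]:
    """Turn the follow-up counts for one context word into probabilities."""
    counts = follows.get(context, Counter())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()} if total else {}


# Made-up training text, purely for illustration.
corpus = "the cat sat on the mat the cat ate"
model = train_bigram(corpus)
probs = next_word_probs(model, "the")
most_likely = max(probs, key=probs.get)
```

Here "the" is followed by "cat" twice and "mat" once, so the model assigns "cat" a probability of 2/3 and picks it as the most likely next word.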

Language models have various applications, such as machine translation, text generation, chatbots, question-answering systems, summarization, and more.

They help generate human-like text and improve performance on natural language understanding tasks.

Some well-known large language models include OpenAI’s GPT (Generative Pre-trained Transformer) series, such as GPT-3 and GPT-4.

Other examples include Google’s BERT (Bidirectional Encoder Representations from Transformers) and Facebook’s RoBERTa.

Language models are trained on large amounts of text data from sources such as books, articles, and websites.

The models learn by predicting the next word in a sequence of words, capturing the statistical patterns present in the data.

Yes, language models can be trained on multilingual datasets and are capable of understanding and generating text in multiple languages. However, their proficiency may vary depending on the languages they were trained on and the diversity of the training data.

Noon is a 7-billion parameter model, making it the largest Arabic language model to date.

Noon was trained to respond to various types of instructions and questions, including text generation, code generation, mathematical problems, and closed/open-book questions.

The training data for Noon is a combination of Arabic datasets covering multiple tasks, particularly the ones mentioned above. Overall, the datasets comprise over 110 data records, adding up to over 11 million words.

Noon was trained on 8 A100 GPUs using distributed multi-GPU training.
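A rough back-of-the-envelope estimate shows why a model of this size is usually trained across several GPUs. The sketch below assumes mixed-precision training with the Adam optimizer (about 16 bytes per parameter) and even sharding of that state across GPUs; the page does not state the optimizer, precision, or parallelism strategy actually used, and activation memory is ignored.

```python
# fp16 weights only, as typically needed for inference.
BYTES_PER_PARAM_FP16_WEIGHTS = 2
# Assumed mixed-precision Adam: fp16 weights (2) + fp16 gradients (2)
# + fp32 master weights, momentum, and variance (4 + 4 + 4).
BYTES_PER_PARAM_TRAINING = 16


def memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Memory in decimal gigabytes for a given per-parameter cost."""
    return num_params * bytes_per_param / 1e9


num_params = 7e9  # 7 billion parameters, as stated for Noon
num_gpus = 8      # as stated for Noon's training setup

weights_gb = memory_gb(num_params, BYTES_PER_PARAM_FP16_WEIGHTS)
training_gb = memory_gb(num_params, BYTES_PER_PARAM_TRAINING)
per_gpu_gb = training_gb / num_gpus  # only if state is sharded evenly

print(f"fp16 weights: {weights_gb:.0f} GB")
print(f"training state: {training_gb:.0f} GB total, {per_gpu_gb:.0f} GB per GPU")
```

Under these assumptions the training state alone exceeds the memory of a single A100 (40 GB or 80 GB), which is why distributing it over several GPUs helps.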

The current version of Noon with 7B parameters (Noon-7b) was evaluated using OpenAI’s GPT-3.5 Turbo model on a set of over 4000 Arabic data samples.

The evaluation prompting used clear and carefully crafted criteria aligned with the model’s training objective and the rules of the Arabic language.

The final evaluation score for Noon was 4.07 out of 5, indicating strong performance according to the evaluation criteria.
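One plausible way to arrive at a single score like this is to average the per-sample ratings given by the judge model. The sketch below assumes a 0 to 5 rating scale and uses made-up ratings; it is not Naseej's actual evaluation pipeline, and the function name is illustrative.

```python
from statistics import mean


def aggregate_scores(scores: list[float], max_score: float = 5.0) -> float:
    """Average per-sample judge ratings after checking the assumed range."""
    if not scores:
        raise ValueError("no scores to aggregate")
    for score in scores:
        if not 0 <= score <= max_score:
            raise ValueError(f"score {score} outside 0..{max_score}")
    return round(mean(scores), 2)


# Hypothetical ratings for six samples; the real set had over 4000.
sample_scores = [5, 4, 4, 3, 5, 4]
final_score = aggregate_scores(sample_scores)
print(f"{final_score} out of 5")
```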

Empowering Arabic developers to drive Arabic AI advancements and Arabic-based AI offerings and products.

Let’s work together

Get in touch today and receive a complimentary consultation.
