Pinned

  1. OLMo Public

    Modeling, training, eval, and inference code for OLMo

    Python · 5.3k stars · 558 forks

  2. dolma Public

    Data and tools for generating and inspecting OLMo pre-training data.

    Python · 1.1k stars · 126 forks

  3. scispacy Public

    A full spaCy pipeline and models for scientific/biomedical documents.

    Python · 1.8k stars · 233 forks

  4. ai2thor Public

    An open-source platform for Visual AI.

    C# · 1.3k stars · 228 forks

Repositories

Showing 10 of 493 repositories
  • open-instruct Public

    AllenAI's post-training codebase

    Python · 2,727 stars · Apache-2.0 · 345 forks · 13 open issues · 12 open pull requests · Updated Mar 1, 2025
  • OLMo Public

    Modeling, training, eval, and inference code for OLMo

    Python · 5,262 stars · Apache-2.0 · 558 forks · 44 open issues · 56 open pull requests · Updated Mar 1, 2025
  • OLMo-core Public

    PyTorch building blocks for the OLMo ecosystem

    Python · 64 stars · Apache-2.0 · 17 forks · 2 open issues · 19 open pull requests · Updated Mar 1, 2025
  • olmo-cookbook Public

    OLMost every training recipe you need to perform data interventions with the OLMo family of models.

    Python · 9 stars · Apache-2.0 · 5 forks · 1 open issue · 2 open pull requests · Updated Feb 28, 2025
  • olmocr Public

    Toolkit for linearizing PDFs for LLM datasets/training

    Python · 3,839 stars · Apache-2.0 · 235 forks · 28 open issues · 16 open pull requests · Updated Feb 28, 2025
  • ai2-scholarqa-lib Public

    Open-source code for the Ai2 Scholar QA app and its corresponding library

    Python · 14 stars · Apache-2.0 · 3 forks · 0 open issues · 1 open pull request · Updated Feb 28, 2025
  • dolma Public

    Data and tools for generating and inspecting OLMo pre-training data.

    Python · 1,100 stars · Apache-2.0 · 126 forks · 24 open issues · 20 open pull requests · Updated Feb 28, 2025
  • pixmo-docs Public

    Synthetic data generation pipelines for Pixmo-docs.

    Python · 39 stars · Apache-2.0 · 7 forks · 1 open issue · 0 open pull requests · Updated Feb 28, 2025
  • rslearn Public

    A tool for developing remote sensing datasets and models.

    Python · 28 stars · Apache-2.0 · 2 forks · 7 open issues · 2 open pull requests · Updated Feb 28, 2025
  • ai2thor Public

    An open-source platform for Visual AI.

    C# · 1,276 stars · Apache-2.0 · 228 forks · 243 open issues · 4 open pull requests · Updated Feb 28, 2025