humaneval

Here are 6 public repositories matching this topic...

bin123apple / AutoCoder

We introduce a new model designed for code generation. Its test accuracy on the HumanEval base dataset surpasses that of GPT-4 Turbo (April 2024) and GPT-4o.

nlp text-generation code-generation nlp-machine-learning humaneval llm code-interpreter

Updated Jul 6, 2024
Python

the-crypt-keeper / can-ai-code

Self-evaluating interview for AI coders

ai transformers humaneval llm langchain llama-cpp ggml

Updated Feb 21, 2025
Python

abacaj / code-eval

Run evaluation on LLMs using the human-eval benchmark

humaneval wizardcoder

Updated Sep 12, 2023
Python

zorse-project / COBOLEval

Evaluate LLM-generated COBOL

evaluation cobol humaneval llm

Updated May 9, 2024
Python

declare-lab / LLM-ReasoningTest

Evaluating LLMs' Mathematical and Coding Competency through Ontology-guided Interventions

reasoning humaneval gsm8k

Updated Oct 29, 2024
Python

talmago / 30-seconds-of-code-eval

Code evaluation with *30-seconds-of-code* examples. Inspired by "Evaluating Large Language Models Trained on Code"

code-generation gpt-3 humaneval

Updated Jan 17, 2022
Python
