LG AI Research Open-Sources Three EXAONE 3.5 Models with Enhanced Instruction-Following and Long Context Capabilities

The Engineer

10 Dec 2024

LG AI Research releases three EXAONE 3.5 models with improved instruction-following and longer context handling, covering use cases from on-device apps to high-performance workloads.

LG AI Research has open-sourced three new models in the EXAONE 3.5 lineup, following the EXAONE 3.0 series released in August 2024. The models cover a range of needs, from lightweight on-device applications to workloads that demand top-tier performance.

Technical Updates and Why They Matter

The EXAONE 3.5 models improve on their predecessors mainly in two areas: instruction-following and long-context handling. Both matter to developers and researchers who need a model to carry out complex tasks precisely and efficiently.

2.4B Model: An ultra-lightweight model designed for on-device use.
- Key Features:
  - Runs efficiently on low-end GPUs or even in environments without robust infrastructure.
  - Ideal for edge devices, mobile applications, and scenarios where resource constraints are a concern.
7.8B Model: A lightweight model optimized for versatile applications.
- Key Features:
  - Same size as the previous open-source model but with enhanced performance.
  - Suitable for a broad range of tasks, from text generation to data analysis.
32B Model: The largest of the three, aimed at users who prioritize top-tier performance.
- Key Features:
  - Designed for demanding applications where accuracy matters most.
  - Ideal for research, enterprise solutions, and large-scale deployments.
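
To make the size differences concrete, here is a rough back-of-the-envelope sketch of how much memory the weights alone would occupy at different precisions. The parameter counts are the nominal sizes from the model names; the storage formats listed are assumptions for illustration, not formats the announcement confirms, and the estimate ignores activations, KV cache, and runtime overhead.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Ignores activations, KV cache, and runtime overhead, so real usage is higher.

PARAMS = {"2.4B": 2.4e9, "7.8B": 7.8e9, "32B": 32e9}  # nominal sizes from the model names
BYTES_PER_PARAM = {"fp16/bf16": 2.0, "int8": 1.0, "int4": 0.5}  # assumed storage formats


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Return approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    for name, n in PARAMS.items():
        row = ", ".join(
            f"{fmt}: {weight_memory_gb(n, b):.1f} GB"
            for fmt, b in BYTES_PER_PARAM.items()
        )
        print(f"{name:>5} -> {row}")
```

By this estimate, the 32B model needs about 64 GB for weights at 2 bytes per parameter, while the 2.4B model needs under 5 GB, which is consistent with the on-device positioning described above.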

Training Efficiency and Decontamination

LG AI Research also focused on training cost. Several strategies were used to keep training cost-effective while maintaining performance:

Pre-training Phase:
- Data Cleaning: Duplicates and personally identifiable information (PII) were removed from the training datasets to improve model quality and reduce infrastructure costs.
- Performance Optimization: Techniques like data augmentation and advanced regularization methods were used to enhance the models’ ability to generate accurate and coherent responses.
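
The data-cleaning step can be pictured with a minimal sketch like the one below. It is not LG AI Research's pipeline, which the announcement does not describe in detail: the function names are hypothetical, deduplication here is exact-match on normalized text, and the two regular expressions are illustrative stand-ins for real PII detection.

```python
import hashlib
import re

# Illustrative patterns only; real PII detection needs far broader coverage.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\b\d{2,4}-\d{3,4}-\d{4}\b")


def redact_pii(text: str) -> str:
    """Replace email addresses and dash-separated phone numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies hash the same."""
    return " ".join(text.lower().split())


def dedup_and_redact(docs: list[str]) -> list[str]:
    """Drop exact duplicates (after normalization), then redact PII in what remains."""
    seen: set[str] = set()
    cleaned = []
    for doc in docs:
        digest = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # an equivalent document was already kept
        seen.add(digest)
        cleaned.append(redact_pii(doc))
    return cleaned
```

Removing duplicates before training shrinks the corpus, which is one way such cleaning can lower infrastructure cost as well as improve quality.
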
Post-training Phase:
- Supervised Fine-Tuning (SFT): The models were fine-tuned on labeled instruction-response examples to improve their instruction-following capabilities.
- Direct Preference Optimization (DPO): The models were trained directly on pairs of preferred and rejected responses, without a separate reward model, so that their outputs better reflect user preferences.
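
For readers unfamiliar with DPO, the sketch below computes the standard DPO loss for a single preference pair, given the summed log-probabilities that the model being trained (the policy) and a frozen reference model assign to the chosen and rejected responses. The function name and the default `beta` of 0.1 are assumptions for illustration; the announcement does not state LG AI Research's hyperparameters or implementation.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value; controls how far the policy may drift
) -> float:
    """DPO loss for one (chosen, rejected) pair, given summed token log-probs."""
    # How much more (or less) likely each response is under the policy
    # than under the frozen reference model.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) == log(1 + exp(-x)), computed in a numerically stable way.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))
```

When the policy matches the reference, the loss equals log 2. It falls as the policy raises the chosen response's likelihood relative to the rejected one.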

Decontamination Process

To ensure the reliability and trustworthiness of the EXAONE 3.5 performance evaluation results, LG AI Research conducted a thorough decontamination process. This involved:

- Borrowing Best Practices: LG AI Research adopted decontamination methods used for leading global models.
- Rigorous Evaluation: Training data was compared against the benchmark test data, and overlapping content was identified and removed so that evaluation scores are not inflated by test examples seen during training.
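
One common way to compare training and test data is n-gram overlap, sketched below. This is an assumed technique chosen for illustration: the announcement does not say which matching method was used, and the function names, whitespace tokenization, and default window of 8 tokens are all hypothetical.

```python
def ngrams(text: str, n: int) -> set[tuple[str, ...]]:
    """Return the set of lowercase whitespace-token n-grams in text."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def decontaminate(train_docs: list[str], test_docs: list[str], n: int = 8) -> list[str]:
    """Keep only training documents that share no n-gram with any test document."""
    test_ngrams: set[tuple[str, ...]] = set()
    for doc in test_docs:
        test_ngrams |= ngrams(doc, n)
    return [doc for doc in train_docs if ngrams(doc, n).isdisjoint(test_ngrams)]
```

A smaller `n` removes more aggressively and risks discarding harmless text, while a larger `n` only catches near-verbatim copies of benchmark items.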

Future Plans

LG AI Research is committed to continuing its open-source efforts. They will actively seek feedback on the EXAONE 3.5 models and use it to release even better versions tailored to the needs of researchers and developers. By fostering a collaborative ecosystem, they aim to drive innovation and advance the field of AI.

Conclusion

The open-sourcing of these three EXAONE 3.5 models represents a significant step forward in making powerful AI tools accessible to a broader audience. Whether you’re working on resource-constrained devices or high-performance applications, there’s an EXAONE 3.5 model that can meet your needs.