Thanks Constance!
Let me know if you’re looking to apply and want to do something together. I have a few ideas around this area as someone who has worked in Sub-Saharan Africa for 10+ years now, and actually wrote a small post around this subject (Large Language Models for Development: Why Information Matters, thegpi.org)
Arno,
Thanks for your engagement and your past writing on LLMs and LDCs. I was not personally looking to apply, but if I had the right partner I would consider it. I have many thoughts about this topic in general too and would be happy to chat. I’ll DM you my Calendly.
I’m already getting an AI use-case brainstorming session together for animal advocates. Perhaps this can be done for global health/development as well. I recently went to a webinar by deeplearning.ai that demonstrated how to fine-tune 2 different LLMs in under 1 hour using a highly efficient tech stack. I think that the problem with outdated info that you mentioned in your blog post could be overcome by training up a targeted LLM with up-to-date information and then assessing it against benchmark data.
If you are interested, here is the full webinar: Building with Instruction-Tuned LLMs: A Step-by-Step Guide by Deep Learning AI
And here is a summary of the webinar I made using the following tech stack:
otter.ai speech-to-text (STT) tool for transcribing --> GPT-3.5 for summarizing the large amount of transcribed text, due to its larger context window --> Google Docs for finding and replacing transcription errors (e.g. misheard terms like QLoRA) --> GPT-4 for summarizing on a more advanced level
~~start of AI content
The video demonstrated the process of building and fine-tuning two large language models (LLMs). It highlighted the importance of instruction tuning, which aligns the model with human expectations in terms of bias, truthfulness, toxicity, etc., and fine-tuning, which refines the model for specific tasks. Several tools and methods were mentioned for the fine-tuning process:
Dolly 15k, a dataset with 15,000 high-quality human-generated prompt-response pairs.
OpenLLaMA, an openly licensed reproduction of LLaMA that can be used commercially and fine-tuned.
QLoRA, a parameter-efficient fine-tuning method that trains small low-rank adapter matrices on top of a quantized base model, greatly reducing memory requirements.
The supervised fine-tuning trainer (SFTTrainer) library, a tool that facilitates the fine-tuning process.
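For readers curious what “prompt-response pairs” look like in practice, here is a minimal pure-Python sketch. The field names follow the style of the Dolly 15k dataset (instruction, context, response); the example record and the `to_training_text` helper are hypothetical, just to show how one pair might be flattened into the single string a trainer sees:

```python
# A hypothetical instruction/response record in the style of the
# Dolly 15k dataset (fields: instruction, context, response).
record = {
    "instruction": "Summarize the key idea of quantization in one sentence.",
    "context": "",
    "response": "Quantization stores model weights at lower numerical "
                "precision to save memory and compute.",
}

def to_training_text(rec):
    """Flatten one record into the single string a trainer would see."""
    parts = ["### Instruction:", rec["instruction"]]
    if rec["context"]:  # context is optional and often empty
        parts += ["### Context:", rec["context"]]
    parts += ["### Response:", rec["response"]]
    return "\n".join(parts)

print(to_training_text(record))
```

Collecting 15,000 of these by hand is exactly the expensive, human-generated labeling effort that makes datasets like Dolly 15k valuable.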
They also talked about the use of quantization, which reduces the numerical precision of the weight matrices, shrinking memory use and optimizing computing resources. This is particularly useful when using limited resources such as Google Colab, which was mentioned as a viable platform for training these models.
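To make the quantization idea concrete, here is a toy pure-Python sketch of 8-bit quantization. Real LLM quantization (including the 4-bit scheme QLoRA uses) is more sophisticated, but the core trade is the same: store low-precision integers plus a scale factor instead of 32-bit floats, accepting a small reconstruction error:

```python
def quantize(weights):
    """Map floats to int8 range [-127, 127] with a single scale factor."""
    scale = max(abs(w) for w in weights) / 127
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from the stored integers."""
    return [x * scale for x in q]

weights = [0.12, -0.54, 0.33, 0.91, -0.02]
q, scale = quantize(weights)
restored = dequantize(q, scale)

# 8-bit ints take 4x less memory than 32-bit floats; the price is a
# rounding error of at most half a quantization step per weight.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q)
print(max_err <= scale / 2)
```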
Two methods of fine-tuning were discussed: supervised and unsupervised. Supervised fine-tuning involves using clearly labeled instructions to train the model, while unsupervised fine-tuning allows the model to learn without specific targets or labels. Both methods have their advantages and drawbacks: supervised fine-tuning requires more time to organize the dataset, while unsupervised fine-tuning can be done faster.
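The data-preparation contrast between the two methods can be sketched in a few lines of pure Python (the examples are hypothetical): supervised fine-tuning needs someone to write an explicit target for every prompt, while unsupervised fine-tuning just consumes chunks of raw text with no labeling effort at all:

```python
raw_text = "Quantization shrinks models. QLoRA fine-tunes them cheaply."

# Unsupervised: split raw text into fixed-size chunks; no labels needed,
# so the dataset can be assembled almost instantly.
def make_unsupervised_examples(text, chunk_size=30):
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

# Supervised: a human (or another model) must author a target response
# for each prompt, which is where the extra organizing time goes.
supervised_examples = [
    {"prompt": "What does quantization do?",
     "response": "It shrinks models by lowering weight precision."},
]

print(make_unsupervised_examples(raw_text))
print(supervised_examples[0]["prompt"])
```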
The presenters demonstrated the process of fine-tuning using both real and synthetic data. Synthetic data, generated by GPT-4, was used to demonstrate the process of fine-tuning a model for generating marketing emails.
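As a rough illustration of the synthetic-data idea: in the webinar, GPT-4 authored the synthetic marketing emails, but the shape of the resulting dataset can be shown with a toy template-based stand-in (no API call; product names and the email body are placeholders):

```python
import itertools

products = ["solar lamp", "water filter"]
tones = ["friendly", "urgent"]

def make_synthetic_pair(product, tone):
    """Build one synthetic prompt-response pair from a template."""
    prompt = f"Write a {tone} marketing email for a {product}."
    email = f"Subject: Meet our {product}!\n\nHi there, ..."  # placeholder body
    return {"prompt": prompt, "response": email}

# Cross every product with every tone to grow the dataset cheaply --
# the same multiplication trick a GPT-4 generation loop would use.
dataset = [make_synthetic_pair(p, t)
           for p, t in itertools.product(products, tones)]
print(len(dataset))
```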
The webinar concluded with the reminder to continuously monitor metrics and evaluate the performance of the models for specific tasks, emphasizing that building LLMs can be done by anyone without needing vast computational resources, especially with tools like QLoRA. They provided a GitHub repo for resources and examples for prompt engineering and fine-tuning.
This instructional video demonstrated the value of building and fine-tuning large language models, and showed how this can be achieved even with limited resources. It provided a comprehensive guide to approaching this complex task, along with insights on optimizing performance and efficiency.
~~end of AI content
Please note that I have no tech background whatsoever and only started seriously diving into AI 1 month ago, so any errors in phrasing or concepts are a result of me still coming up on the learning curve. If anyone has any corrections to the stuff I said here, PLEASE let me know!