Last Updated: 25 Feb 2024
A journal of everything I've accomplished as a Software Engineer interested in learning more about AI.
February was a rather productive month. I spent some time working with basic hyper-parameter optimization using Optuna for the first time and learnt how to run a simple experiment. I also managed to run 3 more sessions of the paper club.
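For reference, here's a minimal sketch of the kind of Optuna experiment I started with - the objective function and search ranges are purely illustrative, not from a real project:

```python
import optuna


def objective(trial: optuna.Trial) -> float:
    # Illustrative search space: two made-up hyper-parameters.
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
    n_layers = trial.suggest_int("n_layers", 1, 4)
    # Stand-in for a real training run that returns a validation loss.
    return (lr - 0.01) ** 2 + n_layers * 0.1


study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=50)
print(study.best_params, study.best_value)
```

Optuna records every trial in the study, so `study.best_params` hands back the best combination it found.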
I also published a new article on writing better Python code, covering some of the key changes in the way I write Python after working with it for the last 5 months.
January was a slow start! I had the opportunity to spend the first part of it in Germany and really enjoyed spending some time with family.
I've been delving deeper into open-source models and experimenting with Cohere's re-ranker. That's how I learnt about batch processing data - Nils Reimers started correcting my usage on Twitter, and I discovered that you should not be using a re-ranker model to evaluate textual similarity. I also ended up learning about metrics such as AUC and precision to evaluate my results, which has been interesting.
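As a rough illustration of the evaluation side, AUC and precision can be computed with scikit-learn like this; the labels and scores below are made up:

```python
from sklearn.metrics import precision_score, roc_auc_score

# Hypothetical binary relevance labels and re-ranker scores.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_score = [0.91, 0.35, 0.78, 0.62, 0.41, 0.12, 0.85, 0.55]

# AUC works directly on the raw scores.
auc = roc_auc_score(y_true, y_score)

# Precision needs hard predictions, so threshold the scores first.
y_pred = [1 if s >= 0.5 else 0 for s in y_score]
precision = precision_score(y_true, y_pred)

print(f"AUC: {auc:.3f}, Precision: {precision:.3f}")
```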
I also started a paper club under Latent Space to help increase interest in LLMs, starting with the Attention Is All You Need paper.
In December I slowed down and took a short holiday. I ended up experimenting more with text-to-UI platforms and documented how I put together a small front-end demo using MagicPatterns in a short tweet, breaking down the specific steps that I took and what the intermediate products were.
I also started on some small projects which I expect to be finished and shareable in March 2024, so stay tuned!
I started working on the Instructor library and published two articles with them, along with getting some MRs merged into the codebase! The published articles were
- Smarter Summaries w/Finetuning GPT-3.5 and Chain Of Density
- Good LLM Validation is Just Good Validation (see the sketch after this list)
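For context, the validation article's core idea is that structured LLM output can be validated like any other data, e.g. with Pydantic validators. A minimal sketch of that pattern, assuming the `instructor.patch` API (which has since evolved) and an illustrative model name:

```python
import instructor
from openai import OpenAI
from pydantic import BaseModel, field_validator

# Patch the OpenAI client so responses are parsed into Pydantic models.
client = instructor.patch(OpenAI())


class UserDetail(BaseModel):
    name: str
    age: int

    @field_validator("age")
    @classmethod
    def age_must_be_non_negative(cls, v: int) -> int:
        # Plain Pydantic validation - no LLM magic needed here.
        if v < 0:
            raise ValueError("age must be non-negative")
        return v


user = client.chat.completions.create(
    model="gpt-3.5-turbo",
    response_model=UserDetail,
    messages=[{"role": "user", "content": "Extract: Jason is 25 years old"}],
)
print(user)  # UserDetail(name='Jason', age=25)
```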
On the side I continued working on more agent code and deployed a small Telegram bot called Conseil, which tracks todos and can understand basic natural-language queries to interact with the database.
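Conseil's internals aren't shared here, so this is only a minimal sketch of the pattern, assuming python-telegram-bot v20 and a hypothetical `parse_intent` helper standing in for the LLM call:

```python
import sqlite3

from telegram import Update
from telegram.ext import ApplicationBuilder, ContextTypes, MessageHandler, filters

db = sqlite3.connect("todos.db", check_same_thread=False)
db.execute("CREATE TABLE IF NOT EXISTS todos (task TEXT)")


def parse_intent(text: str) -> tuple[str, str]:
    # Hypothetical stand-in for an LLM call that maps free text
    # to an (action, task) pair, e.g. ("add", "buy milk").
    if text.lower().startswith("add "):
        return "add", text[4:]
    return "list", ""


async def handle(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
    action, task = parse_intent(update.message.text)
    if action == "add":
        db.execute("INSERT INTO todos VALUES (?)", (task,))
        db.commit()
        await update.message.reply_text(f"Added: {task}")
    else:
        rows = db.execute("SELECT task FROM todos").fetchall()
        await update.message.reply_text("\n".join(r[0] for r in rows) or "No todos yet")


app = ApplicationBuilder().token("YOUR_BOT_TOKEN").build()
app.add_handler(MessageHandler(filters.TEXT & ~filters.COMMAND, handle))
app.run_polling()
```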
So far I've read the papers
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
- Training language models to follow instructions with human feedback
and I think there'll be a good deal more to come for the rest of the month.
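For context on the DPO paper, its core move is replacing the reward-model-plus-PPO pipeline with a single classification-style loss over preference pairs. The objective from the paper, with policy $\pi_\theta$, frozen reference $\pi_{\text{ref}}$, and preferred/dispreferred completions $y_w, y_l$:

```latex
\mathcal{L}_{\text{DPO}}(\pi_\theta; \pi_{\text{ref}}) =
-\mathbb{E}_{(x, y_w, y_l) \sim \mathcal{D}}
\left[ \log \sigma \left(
\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\text{ref}}(y_w \mid x)}
- \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\text{ref}}(y_l \mid x)}
\right) \right]
```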
I've tried to split my time in two in October - doing more LLM work while simultaneously trying to delve deeper into the theory behind more classical machine learning.
I also continued reading papers and got through another two this month.
This month I worked on two main projects
- Using GPT-4 to generate React components with the @shadcn/ui library
- Creating an arXiv crawler (a rough sketch follows this list)
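The crawler itself isn't shown here, but a minimal version against the public arXiv API could look like this, using feedparser; the search query is illustrative:

```python
import feedparser

# Query the public arXiv API for recent papers matching a search term.
BASE_URL = "http://export.arxiv.org/api/query"
query = "search_query=all:large+language+models&start=0&max_results=5"

feed = feedparser.parse(f"{BASE_URL}?{query}&sortBy=submittedDate&sortOrder=descending")

for entry in feed.entries:
    print(entry.title)
    print(entry.link)
    print(entry.summary[:200], "...")
    print()
```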
I also finally got around to finishing up a quick article on the derivation of the softmax and cross-entropy loss, which I worked through in CS231n. I struggled with it for quite some time and thought it might help others.
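For anyone skimming, the punchline of that derivation is how clean the gradient becomes once softmax and cross-entropy are combined. With logits $z$, probabilities $p_i = e^{z_i} / \sum_j e^{z_j}$ and a one-hot target $y$:

```latex
L = -\sum_i y_i \log p_i, \qquad
\frac{\partial L}{\partial z_i} = p_i - y_i
```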
September proved to be a pretty productive month for me. I managed to achieve the following things
I started reading more papers - I read a total of 4 new papers in September, including the Llama 2 paper, LIMA (Less Is More for Alignment), Anthropic's Constitutional AI paper and the new paper on the RWKV architecture.
I launched a new site to collate all my notes - you can check it out here
I started participating in Kaggle competitions
I finally finished up an LLM red-team challenge I called The Chinese Wall. I did a small write-up here, which also covers the Discord bot that I built to accompany it.
- Finished tidying up a repo with notes that I'd taken down on Karpathy's course. It currently contains parts 1, 2 and 3.
- Read up a bit on the paper that he mentioned - A Neural Probabilistic Language Model - which proposes using a real-valued vector to represent words. This is now used extensively in NLP and is known as word embeddings, but back then it must have been a novel idea (a tiny illustration follows this list).
- Played around with the new Next Auth Kysely integration and Resend and wrote a quick article here
- Started working on a small tool as part of Buildspace s4 to help people prep for interviews using GPT-4 and some other models, called Prep With AI, which uses a bunch of the different things that I wrote about
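To illustrate the word-embedding idea from that paper, here's a tiny sketch using PyTorch's `nn.Embedding`; the vocabulary and dimensions are arbitrary:

```python
import torch
import torch.nn as nn

# A toy vocabulary mapped to integer ids.
vocab = {"the": 0, "cat": 1, "sat": 2}

# Each word gets a learnable real-valued vector of dimension 4.
embedding = nn.Embedding(num_embeddings=len(vocab), embedding_dim=4)

ids = torch.tensor([vocab["the"], vocab["cat"], vocab["sat"]])
vectors = embedding(ids)  # shape: (3, 4), one vector per word
print(vectors.shape)
```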
I wasn't able to do as much as I wanted due to reservice commitments, but I did manage to get a few things done.
Discovered Andrej Karpathy's Zero to Hero course and plan to work through it over August. So far I've finished his intro to neural networks and built a basic binary classifier with ~42% accuracy using a custom neural network I coded in vanilla Python. I've finished the first 2 chapters of his course and I'm really enjoying it so far.
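In the spirit of that exercise, here's a minimal single-neuron binary classifier in plain Python with hand-derived gradients - the data is made up and this is only a sketch of the approach, not my actual code:

```python
import math
import random

# Toy dataset: (x1, x2) -> label, roughly separable on x1 + x2 > 1.
data = [((0.2, 0.1), 0), ((0.9, 0.8), 1), ((0.4, 0.3), 0), ((0.7, 0.9), 1)]

w1, w2, b = random.random(), random.random(), 0.0
lr = 0.5


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


for epoch in range(1000):
    for (x1, x2), y in data:
        p = sigmoid(w1 * x1 + w2 * x2 + b)
        # Gradient of binary cross-entropy w.r.t. the pre-activation is (p - y).
        g = p - y
        w1 -= lr * g * x1
        w2 -= lr * g * x2
        b -= lr * g

for (x1, x2), y in data:
    print(y, round(sigmoid(w1 * x1 + w2 * x2 + b), 3))
```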
Finally figured out how to deploy LangChain on AWS Lambda, and spent my entire weekend trying to automate a 20-minute task with the AWS SDK.
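For reference, the Lambda end of that can be a thin handler around a chain. This is a rough sketch assuming the older `langchain.llms.OpenAI` interface (it has since been reorganised) and an API Gateway-style JSON event:

```python
import json

from langchain.llms import OpenAI

# Initialised once per container so warm invocations reuse the client.
llm = OpenAI(temperature=0)


def handler(event, context):
    # API Gateway-style event with a JSON body containing a "prompt" field.
    body = json.loads(event.get("body", "{}"))
    answer = llm(body.get("prompt", "Say hello"))
    return {"statusCode": 200, "body": json.dumps({"answer": answer})}
```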
June has just started and my plan now is to work on more applications of LLMs. I believe that using LLMs to augment my learning will help tremendously when it comes to generating new insights and finding interesting angles to explore.
The plan is to build a local LLM tool using GPT to query and discover new insights from my previous notes and chats. I tried implementing a basic clone with memory and embeddings here but ended up getting side-tracked with other ideas.
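A minimal sketch of the memory-and-embeddings idea - the notes are made up, and it assumes the current OpenAI v1 client and the text-embedding-3-small model rather than whatever I used at the time:

```python
import numpy as np
from openai import OpenAI

client = OpenAI()

notes = [
    "Finished part 1 of the FastAI course.",
    "Deployed LangChain on AWS Lambda.",
    "Read the LIMA paper on instruction tuning.",
]


def embed(texts: list[str]) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])


note_vecs = embed(notes)
query_vec = embed(["what did I do with serverless?"])[0]

# Cosine similarity between the query and every note.
sims = note_vecs @ query_vec / (
    np.linalg.norm(note_vecs, axis=1) * np.linalg.norm(query_vec)
)
print(notes[int(np.argmax(sims))])
```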
I also started experimenting with OpenAI Functions and built out a simple classifier using Yake and GPT that was able to classify places that I had been to before, using my reviews and other metadata ( Link )
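As a rough sketch of how that combination fits together - Yake pulls keywords from a review, and a function-calling request asks GPT to pick a category. The categories and model are illustrative, and it uses the newer `tools` parameter rather than the original `functions` one:

```python
import json

import yake
from openai import OpenAI

client = OpenAI()
review = "Great flat white and quiet corners to work in, but the wifi was patchy."

# Extract the top keywords from the review with Yake.
kw_extractor = yake.KeywordExtractor(top=5)
keywords = [kw for kw, score in kw_extractor.extract_keywords(review)]

# Ask the model to classify, forcing a structured reply via a function schema.
resp = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": f"Review: {review}\nKeywords: {keywords}"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "classify_place",
            "parameters": {
                "type": "object",
                "properties": {
                    "category": {"type": "string", "enum": ["cafe", "restaurant", "bar"]}
                },
                "required": ["category"],
            },
        },
    }],
    tool_choice={"type": "function", "function": {"name": "classify_place"}},
)
args = json.loads(resp.choices[0].message.tool_calls[0].function.arguments)
print(args["category"], keywords)
```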
I've managed to finish up Part 1 of FastAI's course and boy, have I learnt a lot about machine learning in general. The course covers a lot of traditional machine learning techniques and there's plenty I'll definitely need to revisit. You can read my notes here: FastAI Part 1