Last Updated: 25 Feb 2024

A journal of everything I've accomplished as a Software Engineer interested in learning more about AI.

2024

February

February was a rather productive month. I spent some time on basic hyper-parameter optimization, using Optuna for the first time, and learnt how to run a simple experiment. I also ran 3 more sessions of the paper club, where we covered

  • BERT
  • T5
  • Self-Rewarding Language Models

I also published a new article on writing better Python code, covering some of the key changes in the way I write Python after working in it for the last 5 months.

January

January was a slow start! I had the opportunity to spend the first part of it in Germany and really enjoyed having some time with family.

I've been delving deeper into open source models and experimenting with Cohere's re-ranker. Nils Reimers started correcting my usage on Twitter, which is how I learnt about batch processing data and discovered that you should not use a re-ranker model to evaluate textual similarity. I also ended up learning about metrics that were new to me, such as AUC and Precision, to evaluate my results, which has been interesting.
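For readers unfamiliar with these two metrics, here is a minimal sketch in plain Python, assuming binary relevance labels and one model score per example. The function names, the 0.5 threshold and the toy data are hypothetical and chosen only for illustration; they are not the actual re-ranker results.

```python
def precision(labels, scores, threshold=0.5):
    """Fraction of examples predicted positive that are truly positive."""
    predicted = [s >= threshold for s in scores]
    true_pos = sum(1 for y, p in zip(labels, predicted) if p and y == 1)
    pred_pos = sum(predicted)
    return true_pos / pred_pos if pred_pos else 0.0


def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = relevant, 0 = not relevant.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]

print(f"precision@0.5 = {precision(labels, scores):.3f}")
print(f"ROC AUC       = {roc_auc(labels, scores):.3f}")
```

Precision depends on the chosen threshold, while AUC summarises how well the scores rank positives above negatives across all thresholds, which is why the two are often reported together.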

I also started a paper club under Latent Space to help increase interest in LLMs starting with the Attention Is All You Need paper.

2023

December

In December I slowed down and took a short holiday. I ended up experimenting more with text-to-UI platforms and documented, in a short tweet, how I put together a small front-end demo using MagicPatterns, breaking down the specific steps I took and what the intermediate products were.

I also started on some small projects which I expect to be finished and shareable in March 2024, so stay tuned!

November

I started working on the Instructor library, published two articles with them and got some MRs merged into the codebase! The published articles were

On the side I continued working on more agent code and deployed a small Telegram bot called Conseil, which tracks todos and can understand basic natural language queries to interact with the database.

So far I've read the papers

and I think there'll be a good many more to come for the rest of the month

October

I've tried to split my time in two in October - doing more LLM work while simultaneously trying to delve deeper into the theory behind more classical machine learning.

I finished up the first 8 lectures of CS231n and also completed all of the exercises for Assignment 1.

I also continued reading more papers and have read the following two papers this month

This month I worked on two main projects

I also finally got around to finishing a quick article on the derivation of the softmax and cross entropy loss, which I worked through in CS231n. I struggled with it for quite some time and thought it might help others.
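The standard result of that derivation is that the gradient of the cross entropy loss with respect to the logits is simply the softmax output minus the one-hot target. Below is a minimal sketch in plain Python that checks this numerically; the function names and the example logits are hypothetical, and the article itself may present the derivation differently.

```python
import math


def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability assigned to the correct class index."""
    return -math.log(softmax(logits)[target])


def analytic_grad(logits, target):
    """dL/dz_i = softmax(z)_i - 1 if i is the target class, else softmax(z)_i."""
    probs = softmax(logits)
    return [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]


def numeric_grad(logits, target, eps=1e-6):
    """Central finite differences, used only to sanity-check the formula."""
    grads = []
    for i in range(len(logits)):
        up = list(logits)
        down = list(logits)
        up[i] += eps
        down[i] -= eps
        diff = cross_entropy(up, target) - cross_entropy(down, target)
        grads.append(diff / (2 * eps))
    return grads


# Hypothetical example: three classes, class 0 is the correct one.
logits = [2.0, 1.0, 0.1]
target = 0

for a, n in zip(analytic_grad(logits, target), numeric_grad(logits, target)):
    print(f"analytic={a:+.6f}  numeric={n:+.6f}")
```

The two columns should agree to several decimal places, and the gradient entries sum to zero because the softmax probabilities sum to one.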

September

September proved to be a pretty productive month for me. I managed to achieve the following things

  1. I started reading more papers - I read a total of 4 new papers in September, including the Llama 2 paper, LIMA: Less Is More for Alignment, Anthropic's Constitutional AI paper and the new paper on the RWKV architecture

  2. I finished up two MOOCs - Introduction to Statistics by Stanford and Zero to Hero

  3. I launched a new site to collate all my notes - you can check it out here

  4. I started participating in Kaggle competitions

  5. I finally finished an LLM red-team challenge I called The Chinese Wall. I did a small write-up here, which also covers the Discord bot that I built to accompany it.

August

  • Finished tidying up a repo with notes that I'd taken on Karpathy's course. Currently we have parts 1, 2 and 3 in there.
  • Read up a bit on the paper that he mentioned - A Neural Probabilistic Language Model, which introduces the use of a real-valued vector to represent each word. These vectors are now used extensively in NLP and are known as word embeddings, but back then I'm sure it must have been a novel idea.
  • Played around with the new Next Auth Kysely integration and Resend and wrote a quick article here
  • Started working on a small tool called Prep With AI as part of Buildspace s4. It helps people prep for interviews using GPT-4 and some other models, and uses a bunch of the different things that I wrote about

July

I wasn't able to do as much as I wanted due to reservist commitments, but I did manage to get a few things done.

  • Discovered Andrej Karpathy's Zero to Hero course and plan to keep working through it in August. So far I've finished the first 2 chapters, including his intro to neural networks, and built a basic binary classifier with ~42% accuracy using a custom neural network I coded in vanilla Python. I'm really enjoying it so far.

  • Finally figured out how to deploy LangChain on AWS Lambda and spent my entire weekend trying to automate a 20 min task with the AWS SDK

June

June has just started and my plan now is to work on more applications of LLMs. I believe that using LLMs to augment my learning will help tremendously when it comes to generating new insights and finding interesting angles to explore.

The plan is to build a local LLM tool using GPT to query and discover new insights about my previous notes and chats. I tried implementing a basic clone with memory and embeddings here but ended up getting sidetracked with other ideas.

I also started experimenting with OpenAI Functions and built a simple classifier using YAKE and GPT that was able to classify places I had been to before using my reviews and other metadata (Link)

May

I've managed to finish Part 1 of Fast AI's course and boy have I learnt a lot about machine learning in general. The course seems to cover a lot more traditional machine learning techniques, and there's a lot which I'll definitely need to revisit. You can read my notes here: FastAI Part 1