Ask HN: What are some "toy" projects you used to learn neural networks hands-on?
76 points by profsummergig 5 months ago | 48 comments
NN = Neural Networks.

And if you can share Docker Compose based set-up, please do (I like Docker Compose for its simplicity).

Synonyms of "toy" include: nano, micro, games, something that can be played with on an off-the-shelf laptop.

Python or JavaScript preferred.




While not quite your definition of toy, I have a small deep learning rig I built myself with two 4090s, and that has been enough to train several different ~200M-parameter LLMs, starting with a hand-rolled tokenizer and just vanilla PyTorch to experiment with different architectures. While it's not going to win any benchmarks or be usable for real problems (for those you should just fine-tune Llama), it has been super valuable for me to really understand exactly how these things work.

I use devpod.sh and a PyTorch dev container I can spin up locally, with the intention of also spinning it up in the cloud to scale experiments up (but I haven't done much of that). Still, I can recommend devpods for a reproducible environment I don't feel worried about trashing!

If people are interested I can throw the git repo up now, but I have been planning on finding some time to clean it up and write up a really short digest of what I learned.

Above anything I can write, though, I highly recommend Andrej Karpathy's YouTube channel: https://www.youtube.com/@AndrejKarpathy (you can follow along in a Google Colab, so all you really need is a web browser). My project started as following along there and then grew when I wanted to train it to mimic my friends and me using some data I had of us chatting in Slack, which meant some architecture improvements, figuring out how to pre-train on a large corpus, etc.


I want to echo the recommendation of Andrej Karpathy's YouTube channel.

Before I started watching his videos, I thought that understanding how gradient descent actually worked and what autograd actually does under the hood was unimportant - after all, I can get a working network by just slapping together some layers and letting my ML framework of choice handle the rest (and, to the credit of modern frameworks, you can get impressively far with that assumption). Andrej's Micrograd video was what changed my mind - understanding the basics of how gradients are calculated and how they flow has made everything make so much more sense.
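To make the "how gradients are calculated and how they flow" point concrete, here is a minimal sketch of a scalar autograd engine in the spirit of micrograd. It is not Karpathy's code: the `Value` class, its `backward_fn` closures, and the example numbers are assumptions made up for illustration.

```python
import math


class Value:
    """A scalar that remembers how it was computed so gradients can flow back."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward_fn = None  # set by the operation that created this node

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def backward_fn():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward_fn = backward_fn
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def backward_fn():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward_fn = backward_fn
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,))

        def backward_fn():
            # d tanh(z)/dz = 1 - tanh(z)^2
            self.grad += (1.0 - t * t) * out.grad

        out._backward_fn = backward_fn
        return out

    def backward(self):
        # Topologically sort the graph, then apply the chain rule in reverse.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            if node._backward_fn is not None:
                node._backward_fn()


# One neuron: y = tanh(w * x + b), with made-up values.
x, w, b = Value(2.0), Value(-0.5), Value(0.25)
y = (w * x + b).tanh()
y.backward()
print(y.data, w.grad, x.grad, b.grad)
```

After `y.backward()`, each input's `.grad` holds the derivative of `y` with respect to that input, which is the quantity a framework's autograd computes for every parameter.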

If the classes at my university had been as good as what that man publishes on his YouTube channel for free, I would've actually finished my degree instead of dropping out.


Thank you. I love that Karpathy is a minimalism enthusiast (nanoGPT, microGPT).


I built a gradient descent visualizer in JS with Svelte and TensorFlow.js [0].

Also wrote about it in my blog [1].

[0] https://gradfront.pages.dev/

[1] https://blog.horaceg.xyz/posts/need-for-speed/
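For readers who want the bare idea behind such a visualizer, here is a minimal sketch of plain gradient descent that records the path a plot would draw. It is not taken from the linked project; the bowl-shaped `loss`, the starting point, and the learning rate are all assumptions chosen for illustration.

```python
def loss(x, y):
    # Elongated bowl with its minimum at (1, -2); chosen only for illustration.
    return (x - 1.0) ** 2 + 10.0 * (y + 2.0) ** 2


def grad(x, y):
    # Analytic partial derivatives of the loss above.
    return 2.0 * (x - 1.0), 20.0 * (y + 2.0)


def descend(start, learning_rate, steps):
    """Return every point visited, which is what a visualizer would plot."""
    x, y = start
    path = [(x, y)]
    for _ in range(steps):
        gx, gy = grad(x, y)
        x -= learning_rate * gx
        y -= learning_rate * gy
        path.append((x, y))
    return path


path = descend(start=(-3.0, 2.0), learning_rate=0.04, steps=200)
print(path[0], path[-1])
```

Raising `learning_rate` above 0.1 makes the steep `y` direction overshoot and diverge, which is the kind of behaviour that is much easier to grasp when plotted.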



Thank you.


Yeah, the videos and projects here are excellent.


This was a while back, but I designed and built an AI-art installation from scratch, where I trained a GAN to generate abstract art. The generated images were shown on a Samsung The Frame, and below the screen was a button. If you pushed the button, a new and unique piece was generated and shown on the screen.

If you're interested in doing something similar, I wrote an extensive guide on how to build it here: https://github.com/maxvfischer/DIY-ai-art

The guide doesn't include the ML part, so that you will have to learn on your own and integrate into the project :)


This was years ago, but I experimented with running TensorflowJS on a raspi to determine whether or not to open the chicken coop door when a creature comes nearby. Essentially, take a pic of the scene, and determine "chicken or not chicken" and open the door if the former. So, I trained the model on lots of pictures of chickens for positive, and lots of common predators (snake, raccoon, fox, etc.) for negative. It worked pretty well, except for one major flaw: I couldn't do the hardware to save my life. I rage-quit when I couldn't get the motor to keep the door open.


I teach students to build very small, uncomplicated VAEs on MNIST. No CNNs, just fully-connected layers.

We then navigate through the latent space, exploring it by e.g. interpolating between the mean vector for the number 7 and its positive and negative standard deviations. We then decode the latent vectors, revealing interesting relations like "the closer you get to one stdev in the positive direction, the more the 7 looks like a 9" or "7 is entangled with skew".
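The latent-space walk described above can be sketched roughly as follows. This is only an illustration under assumptions: the 2-D latent size, the numbers in `mean_7` and `std_7`, and the helper names are made up, and the decoder itself is left out because it needs a trained VAE.

```python
def lerp(a, b, t):
    """Linear interpolation between two equal-length vectors."""
    return [(1.0 - t) * ai + t * bi for ai, bi in zip(a, b)]


def latent_walk(mean, std, steps):
    """Walk from mean - std to mean + std in `steps` evenly spaced points."""
    low = [m - s for m, s in zip(mean, std)]
    high = [m + s for m, s in zip(mean, std)]
    return [lerp(low, high, i / (steps - 1)) for i in range(steps)]


# Hypothetical 2-D latent statistics for the digit 7 (made-up numbers).
mean_7 = [0.5, -1.0]
std_7 = [0.25, 0.5]

walk = latent_walk(mean_7, std_7, steps=5)
for z in walk:
    print(z)
# Each z would then go through the trained decoder to produce an image,
# e.g. decoder(z); no decoder is defined in this sketch.
```

The middle point of the walk is the mean itself, so the decoded images on either side show what one standard deviation in each direction does to the digit.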


The "navigat[ing] through latent space" sounds very interesting. Do you have any class docs going over what you teach, or other docs/sources to learn that?


Sorry, no Docker or anything like that but I'll share my story anyway because I had no idea if it would work and I was almost amazed by the results.

I made a prototype of a Dr Mario-inspired game with a new mechanic (the blank half-pill) in Racket. I hand-coded an AI, and then I wanted to see if I could train a NN to predict what that AI would play in any given game-state. Obviously this allows me to generate training data much faster than playing the game manually.

I did get it kind-of working, as you can see in the two most recent commits on this branch: https://github.com/default-kramer/fission-flare/commits/ML/

I learned that the most important factor for me was the size of the training data, e.g. training on 50k games is way better than 10k games. As I recall, I got it to correctly predict the hand-coded AI's move 77% of the time, and when it didn't get it right it usually had a plausible alternate move. I was pretty surprised that, with a relatively underpowered laptop and a severely undersized data set, it was able to get that accurate. (I suppose it is easier to predict a deterministic algorithm's moves than a human player's moves.)

After doing the ML stuff I decided, "okay, enough prototyping, time to turn this into a semi-polished game using Godot." Well it turns out Godot, although it is amazing, is significantly less fun than Racket. So I got this far before I got sick of it and moved on to a different project: https://blockcipherz.com/


Oops, now that I look closer it's apparent there is some work that I never committed or pushed! I'll try to resolve that...


I wrote very hacky Python to play chrome://dino/ using TensorFlow in college for a class project.

I didn't know anything about RL so I racked my brains and came up with an approach using CNNs with the goal of creating a model that could play at least as well as I could. The project was three separate scripts: collect training data, train a model, and then run that model using Selenium. The model would be trained to predict an action (jump or don't) using images of the game (state), and the training data was generated by running a script while playing the game that recorded the screen and logged which keys I was pressing at the time. The CNN was simple: alternating 2D convolutional layers and 2D max pooling layers, followed by two dense classification layers.

First, after a couple hours, I realized my poor GTX 1070 laptop GPU would struggle with even 640x480 captures. Read some docs, did some input processing, and after a week with a lot of Googling, got things running.

However, the accuracy was terrible. It took a while but I realized my data included the frames where I had lost the game. I started manually deleting the last ~1 second of images from each session and it worked! What a feeling!
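The manual cleanup described here could be automated along these lines. This is a sketch, not the original script: the `(frame_id, action)` session layout, the 10 fps capture rate, and the function name are assumptions.

```python
def trim_final_frames(session, fps, seconds=1.0):
    """Drop the frames recorded in the last `seconds` before the crash."""
    drop = int(round(fps * seconds))
    if drop <= 0:
        return list(session)
    return session[:-drop] if drop < len(session) else []


# Hypothetical session: (frame_id, action) pairs captured at 10 frames per second.
session = [(i, "jump" if i % 7 == 0 else "none") for i in range(35)]
clean = trim_final_frames(session, fps=10)
print(len(session), len(clean))
```

Sessions shorter than the trimmed window are discarded entirely, since every frame in them is close to a losing move.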

Since the game speeds up over time the model hit a limit quickly. I used OpenCV to literally write in numbers on the saved images to provide some kind of info about the length of the game to the game state, and it worked again!

Then I ran into a new problem: the model made it to a part of the game where black and white are inverted (the palette shifts from "day" into "night") and consistently failed. I hadn't made it that far very often so there was too little "nighttime" data. So I learned about data augmentation without knowing the term for it; with a quick script, I copied all my training data with black/white reversed, and the model ended up besting my top score by a solid margin. I never realized I could have just swapped the image colors when running the model.
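The black/white augmentation amounts to something like the sketch below. It assumes 8-bit grayscale frames stored as lists of pixel rows and `(image, action)` samples; the original script's data format is not described, so these are illustrative choices.

```python
def invert(image):
    """Swap dark and light in an 8-bit grayscale image (list of rows)."""
    return [[255 - pixel for pixel in row] for row in image]


def augment_with_inverted(dataset):
    """Return the original samples plus a colour-inverted copy of each one."""
    augmented = list(dataset)
    for image, action in dataset:
        augmented.append((invert(image), action))  # the label is unchanged
    return augmented


# Tiny made-up "day" frame: light background (255) with a dark obstacle (0).
day_frame = [[255, 255, 0],
             [255, 0, 0]]
dataset = [(day_frame, "jump")]
augmented = augment_with_inverted(dataset)
print(augmented[1])
```

The alternative mentioned at the end of the comment would call `invert` on night-time frames at inference time instead of doubling the training set.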

It was the most fun I'd ever had with programming to that point and kicked off my passion for ML and AI. Ugly manual steps, poorly written code, using the wrong kind of model - but it was true creative problem solving and I loved it.



I worked on neural nets back in the mid 1980s, so I was somewhat familiar with their structure, backpropagation, squashing functions, etc. I wanted to get back into them, so I started by rederiving all backprop formulas by hand here https://github.com/tbensky/NeuralNetworks/blob/main/Backprop....

Then I wrote up a NN from scratch in Python to train some simple vectors and even got it to train some MNIST characters https://github.com/tbensky/NeuralNetworks/blob/main/ANN/ann.... (but it was slow).

With that basic refresher, I got into PyTorch and worked on training a PiNN: https://github.com/tbensky/PiNN_Projectile (neat video of it training: https://youtu.be/0wlHa1-M7kw).

Now I'm working on understanding kernels with CNNs (this is my question, I'm making good progress on answering): https://ai.stackexchange.com/questions/46180/kernels-on-a-tr....

Having loads of fun!


https://www.mnist.org

I wanted to actually build first-hand intuition on all of the choices around hyperparameters, activation functions, network architectures, etc. So I've been rigorously exploring them by training and testing models on the MNIST dataset.

Coming up soon: vision transformers, depth-of-architecture on CNNs, batch size investigations, and more.

Let me know if any of you have any suggestions of things to investigate next!


Two years ago, I had a need to detect if an image was upside down or right-side up. It occurred to me that this seems like a binary classification problem so I went through the whole process of building a model from scratch, downloading data, feeding it in, optimizing the model, optimizing the data, and so on and so forth.

The final result is here: https://github.com/kevmo314/image-orientation-detection

The repo itself is probably not that insightful, because it was the actual steps that taught me a lot, but finding a problem that I felt was independently solvable and then solving it with a neural net helped me go from only an academic knowledge of neural networks to being able to confidently implement them.

I can highly recommend trying to find a similar problem (one that isn't just "someone on the internet suggested I do this as an exercise") if that resonates with you since it was going from the feeling of not knowing that it could be done to a final model that taught me the most.

Also, if you're curious, the use case was that I needed a way to detect if video coming from a GoPro was upside down.
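One way to get labelled data for an upside-down detector is to rotate images that are known to be upright. The comment does not say how its data was labelled, so the sketch below is purely an assumption, with images represented as lists of pixel rows and hypothetical helper names.

```python
def rotate_180(image):
    """Rotate an image (list of pixel rows) by 180 degrees."""
    return [row[::-1] for row in image[::-1]]


def make_orientation_pairs(upright_images):
    """Label each upright image 0 and its rotated copy 1."""
    samples = []
    for image in upright_images:
        samples.append((image, 0))
        samples.append((rotate_180(image), 1))
    return samples


# Tiny made-up image so the rotation is easy to follow.
image = [[1, 2, 3],
         [4, 5, 6]]
samples = make_orientation_pairs([image])
print(samples)
```

Every source image yields one example of each class, so the resulting data set is balanced by construction.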


Heh, almost exactly what I wanted to do.


I built a compiler (https://bernsteinbear.com/blog/compiling-ml-models/) for micrograd (https://github.com/karpathy/micrograd/) to better understand how it worked


I did a research project on this a while back - and when it comes to understanding deep network learning rate, regularization, hidden layer effects, and activations, I don't think anything is better than [this little web app](https://playground.tensorflow.org/#activation=tanh&batchSize...)


I built a simple autodifferentiation engine [1] in JavaScript when I was first learning about backpropagation. It's a great way to learn how frameworks like PyTorch work.

[1] https://blog.jinay.dev/posts/backprop/


I did some nn stuff years ago (before the days of ReLU) then came back to it in the last year or two.

After reading many papers and catching up with the ideas, I'm doing some simple little things to learn Python and the intricacies of Torch.

The first thing I made was a tiny upscaler to go from 16x16 rgb to 32x32 rgb. Next was an autoencoder to turn a 32x32 rgb into a number of bytes (128 in testing) and back.

Next up is a combination of both to autoencode correction data to correct an upscale.

It's been well worth it for learning the programming interface to the ideas. Wrangling training data has also been valuable experience for something that isn't terribly complex but is easier once you've done it a few times.
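On the training-data wrangling for a small upscaler: a common approach is to shrink a high-resolution image and train the network to recover the original. The sketch below assumes that approach (the comment does not describe its pipeline), works on a single channel rather than RGB, and needs even image dimensions.

```python
def downsample_2x(image):
    """Average each 2x2 block of a single-channel image (list of rows)."""
    height, width = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, width, 2)
        ]
        for r in range(0, height, 2)
    ]


def make_pair(high_res):
    """Return (input, target) for an upscaler: the small image and the original."""
    return downsample_2x(high_res), high_res


# Made-up 4x4 patch standing in for one channel of a 32x32 image.
patch = [[0, 4, 8, 8],
         [4, 8, 8, 8],
         [1, 1, 2, 2],
         [1, 1, 2, 2]]
low, target = make_pair(patch)
print(low)
```

For RGB data the same function would be applied to each channel separately.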


Daniel Shiffman has a fun neural network section in his (free online) book Nature of Code:

https://natureofcode.com/neural-networks/


This looks like a fantastic resource. Thank you.


I wrote a gradient-descent (not neural networks) based IK solver in the browser. It also has collision detection, customization of the arm, and a keyframe editor/timeline so you can program and play back a set of moves.

https://grippy.app/

Not sure how easily you could train neural networks on it though.

https://github.com/pickles976/LearningRobotics/tree/main/IK/...
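For anyone curious what gradient-descent IK looks like in its simplest form, here is a sketch for a two-link planar arm. It is not the code behind grippy.app: the arm geometry, the numerical (finite-difference) gradient, the learning rate, and the step count are all assumptions picked for illustration.

```python
import math


def end_effector(angles, lengths):
    """Forward kinematics of a planar arm: joint angles -> (x, y) of the tip."""
    x = y = total = 0.0
    for angle, length in zip(angles, lengths):
        total += angle  # each joint angle is relative to the previous link
        x += length * math.cos(total)
        y += length * math.sin(total)
    return x, y


def squared_error(angles, lengths, target):
    """Squared distance between the arm tip and the target point."""
    x, y = end_effector(angles, lengths)
    return (x - target[0]) ** 2 + (y - target[1]) ** 2


def solve_ik(angles, lengths, target, learning_rate=0.05, steps=2000, eps=1e-6):
    """Minimise the squared tip-to-target distance with numerical gradients."""
    angles = list(angles)
    for _ in range(steps):
        base = squared_error(angles, lengths, target)
        gradient = []
        for i in range(len(angles)):
            nudged = list(angles)
            nudged[i] += eps
            gradient.append((squared_error(nudged, lengths, target) - base) / eps)
        angles = [a - learning_rate * g for a, g in zip(angles, gradient)]
    return angles


lengths = [1.0, 1.0]
target = (1.0, 1.0)
solution = solve_ik([0.3, 0.3], lengths, target)
print(solution, end_effector(solution, lengths))
```

A real solver would also handle joint limits, unreachable targets, and collisions, none of which this sketch attempts.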


http://neuralnetworksanddeeplearning.com/

Coded everything from scratch, first in Elixir, then rewrote some parts in C.



I ported inference of a couple of open-source models from Torch/Python/CUDA into D3D11 compute shaders, which run on all 3D GPUs, not just on nVidia.

Whisper was the first, but I can’t say I’m happy with it because there is too much complicated C++ https://github.com/Const-me/Whisper

For Mistral I decided to rely on C# as much as possible and I like the result much better https://github.com/Const-me/Cgml/


A tiny tiny LLM (essentially removing the "Large" part of "Large Language Model"). I taught neural networks to remember Wikipedia articles (actually, just one Wikipedia article about horses) and throw it back as-is by predicting the next token (when given the first token).

https://github.com/antoineMoPa/tfjs-text-experiment/blob/mai...



I wrote a NN in PHP just for fun/learning:

https://github.com/leftouterjoins/php-nn


Reminds me of somebody I knew who had a Black Hat SEO blog a long time ago.

One of the "rules of the street" is that there are certain subjects you're not supposed to post about and if you do you will get hacked.

He posted a PHP script that used a neural network to break CAPTCHAs, and this brought on a denial of service attack. He called the FBI for help but with the sketchy people hanging out in his chatroom this turned out to be quite the misadventure.


Who is doing the hacking? Black hats that have already monetized CAPTCHA breaking as a service?


Could be. Such services existed then. The FBI got called in and managed to stop the attack, but I don't think the FBI could ever prove who did it.


Been working on an autoencoder that converts the hidden states of transformer models into a spatial representation that can be visualized. Started more on the toy scale but now I'm trying to scale it beyond my humble 3060. Using LLMs to help with torch and such but they are limited in the details of tensor twiddling.

https://github.com/ristew/weightscan


I started AI upscaling and extending Star Trek DS9 and Scrubs to 16:9 Full HD. It's mostly custom diffuser models and custom GANs for upscaling. There are tons of edge cases, so you need to fiddle a bit, but it works better than expected. I am at 2-3 episodes per week with 4-6 hours of work. I also started Gaussian splatting environments from TNG for use in Unreal. It's fun :)


Read this article and slowly re-implement the accompanying code: https://karpathy.github.io/2015/05/21/rnn-effectiveness/ It's difficult and tedious but also very useful to implement back-propagation by hand.
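A small habit that makes hand-written back-propagation less painful is checking each derived gradient against finite differences. The sketch below does this for a single sigmoid neuron with squared error; it is unrelated to the code accompanying the linked article, and the function names and input values are made up.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss(w, b, x, target):
    """Squared error of a single sigmoid neuron."""
    y = sigmoid(w * x + b)
    return 0.5 * (y - target) ** 2


def backprop(w, b, x, target):
    """Hand-derived gradients via the chain rule: (dL/dw, dL/db)."""
    y = sigmoid(w * x + b)
    dL_dy = y - target
    dy_dz = y * (1.0 - y)      # derivative of the sigmoid
    dL_dz = dL_dy * dy_dz
    return dL_dz * x, dL_dz    # dz/dw = x, dz/db = 1


def numerical(w, b, x, target, eps=1e-6):
    """Central finite differences, used only to check the hand-derived result."""
    dw = (loss(w + eps, b, x, target) - loss(w - eps, b, x, target)) / (2 * eps)
    db = (loss(w, b + eps, x, target) - loss(w, b - eps, x, target)) / (2 * eps)
    return dw, db


analytic = backprop(0.7, -0.2, 1.5, 1.0)
approx = numerical(0.7, -0.2, 1.5, 1.0)
print(analytic, approx)
```

If the two pairs disagree beyond a small tolerance, the hand derivation has a bug; the same check scales to full layers by perturbing one weight at a time.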


Thank you, Karpathy has some incredible content for beginners.


Playing around with vision models like Detectron2 to do segmentation etc. can be a fun and practical way to learn torch.



MNIST image classification is the "hello world" of NNs; I recommend it.


Thanks.



Karpathy's Makemore, built in many different frameworks like JAX, Differentiable Swift, MLX


I didn't build a game but I was incredibly excited to see if neural networks could work in financial markets, so I built a few systems to trade using price data.

Spoiler alert: they absolutely do not work in their vanilla form (every article that says they do is wrong). But it was a good learning experience.


Anyone else initially think of Netscape Navigator? (I know, I'm old.)


Hotdog, not hotdog



