Hacker News new | past | comments | ask | show | jobs | submit login
I told my 2nd year CS students to create a programming language (dovyski.com)
129 points by dovyski on April 13, 2012 | hide | past | favorite | 63 comments



I did something like this once many years ago for one of my wife's students, a precocious high-schooler who needed to be challenged beyond the elementary Fortran she was teaching at the vocational school. So I outlined the steps it would take to translate a toy language into a toy computer's instruction set, and reviewed it with the young man one afternoon at my office - an abbreviated compiler theory class with an eager audience of one. Tom took the big flowchart that we made home, studying and meticulously documenting it. When he brought me the documented design a couple of days later I asked him to implement it.

Tom spent awhile doing this in Waterloo Fortran, and when he presented me with the working compiler I said "Good job. Now execute the code." He blinked his surprise - until then I don't think he'd considered that the toy machine could have been interpreted. A revelation! He went back home and wrote the VM for the toy machine without any assistance; it worked well and I was as proud of him as I could be.

A couple of years later I hired Tom and a couple of years after that he lapped me, going on to better things. 30 years later I still keep a copy of his compiler and interpreter, reminding me from time to time of the bright kid who grew up to become a fine programmer and family man.


Thank you. Though you only helped one kid, someone else helped me...

...I was once the precocious high-schooler; in my History class I put a .com file in the program's directory that ran a look-alike program which didn't in fact work - on perhaps the only IBM PC in the school.

(Since DOS executed .com's first, and only the .exe if no .com was found, adding a .com was a prank and harmless, but that didn't help my case.)

It was only a few years later, me interviewing for a job at a tech company that the manager told me he was my History teacher's husband, and had cleaned up my mess.

I could only thank him for both placating my teacher and then going on to offer me a job.

In hindsight, I could have figured out that the nicest PC in the school wouldn't be there just out of sheer random chance...


Wow, great history! Thanks for sharing that.


I wish I had a professor like you. When I asked why there was no compiler class, the answer I was given was: "compilers are a solved problem; there's no point in teaching them anymore."

Maybe it's just my school though.


My high school guidance counsellor tried to dissuade me from taking a computer science undergraduate degree, stating that all the programs that are useful had already been written.

I can only imagine what he told the kids who wanted to become civil engineers.


That's jaw-dropping really... every student should have to experience the horrors of the dragon book.

*edit: I say that firmly tongue in cheek. The Dragon book is one of the most important pieces of CS literature ever written. It's a fine book, the subject matter is just difficult:)


The Dragon book may be a fine book but it is mainly a book on parsing, not on compilers. You first learn about control flow graphs in chapter 9! Most courses probably won't even get that far. The parts that are about compilation are very old fashioned, essentially a book about "how to compile Fortran to inefficient code like they did in the '60ies".

It would be great if there was a book on modern compiler techniques, including SSA, abstract interpretation, compiling high level languages, pointer analysis, compiling dynamic dispatch, garbage collection and closures, etc. Probably the closest are Appel's compiler books.


http://www.cs.princeton.edu/~appel/modern/ml/

http://store.elsevier.com/product.jsp?isbn=9780120884780

Two of the most important books I own.

Also, thank you for tearing down the dragon book -- I was going to do the same myself.

The dragon book is outdated on parsing, too. It's from a time where single-pass parsing and compilation was sometimes necessary because you didn't have enough memory (or time) to do multiple passes. It certainly doesn't have anything on packrat parsing. It is a historical book -- I would recommend it for no serious purpose (even pedagogical) these days.


> http://store.elsevier.com/product.jsp?isbn=9780120884780

I don't trust this book; it has an error that I reported to the authors twice but never heard a word back (or any note of the error in their list of errata).

Here's what I wrote to them:

  Subject: Errata for "Engineering a Compiler"
  Date: Sat, May 26, 2007 at 8:12 PM

  Hello,

  I noticed that your section on DFA minimization is labeled
  "Hopcroft's Algorithm," but doesn't seem to describe the
  algorithm explained by Hopcroft in his 1971 paper [0]. Your
  algorithm appears to be n^2, where Hopcroft's is n log n. David
  Gries gave a somewhat more digestible presentation of the same
  algorithm in a follow-up paper in 1972 [1].

  Hope this information is useful!

  Sincerely, <me>

  [0] John E. Hopcroft. An n log n algorithm for minimizing states
  in a finite automaton. Technical Report: CS-TR-71-190, 1971.
  Available online at
  ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/71/190/CS-TR-71-190.pdf

  [1] David Gries. Describing an algorithm by Hopcroft. Acta
  Informatica, 2(2):97-109. Online reference:
  http://www.springerlink.com/content/r5631549671g6251/

> The dragon book is outdated on parsing, too. It's from a time where single-pass parsing and compilation was sometimes necessary because you didn't have enough memory (or time) to do multiple passes. It certainly doesn't have anything on packrat parsing.

I'm pretty sure that all production compilers do one-pass parsing even now because of efficiency concerns. I would be very surprised if any production compiler used packrat parsing which takes O(n) memory.


I have been using Kenneth Louden's book on compiler construction and as a beginner I find it extremely helpful so far.


Some bigger schools now teach with their own set of materials that the professors put together. I think it might be precisely because of this reason.


Agreed. I'm taking a class right now with the dragon book (taught by Aho, actually - one of the authors of the book), and it's easily the best and most important CS class I've taken[1]. The main project for the class is to design a language and write a working compiler/interpreter for it - in all the years he's taught the class, apparently no group has failed to do this. The really tricky part is making a language that other students in your class will want to use, though, and that goes beyond the 'simple' issue of compiler structure!

I half-wish that there would be a way to teach a compilers class as the second class in a CS track (ie, right after the intro class), but the problem is that it just requires too much familiarity with the details in order to impose upon most beginner/intermediate students, even if a few really interested ones could potentially handle it.

It might be easier if you started off with Scheme and then went from there (a la SICP), but even then, you'd miss a lot of the more fundamental concepts assumed in the Dragon book that really just need a lot of time to be digested.

[1] Alright, there's probably a three-way tie for 1st, but it still counts. My education would not have been complete without it; I highly recommend the Dragon book to anybody who has the time and interest.


My compilers class was a nightmare (and we used the Dragon book). I had hoped a compilers course would actually teach about the full process by which a text file becomes executable 1s and 0s. Instead it was a months-long torture of parsing text. Regex to NFA, NFA to regex, regex to DFA, NFA to DFA, DFA minimization, LL(1), LR(0), SLR(1), LR(1), LALR(1), along with memorizing all the rules governing which grammars are amenable to which parser, and working through all of the preceding by hand. And then finally about a week of teach-yourself-LLVM.

I don't know if "compilers are a solved problem", but I'm pretty damned sure that the only people who need to know that much about text parsing know who they are, and would be able to reference it when needed. As an undergraduate course, it was utterly worthless.


I interviewed Professor Aho a few years back; I'm a little jealous I was never fortunate enough to take that class. It's likely to be one of the most practical classes in all of computer science anywhere.

(I won't give away the punchline for the class. I don't know how far along you are on the project.)


I took that class from the other instructor, who had us use OCaml instead of Java (as I presume Aho still uses.) The Dragon book was interesting, but, for the most part, we were able to get on from the in-class slides.

I don't think there's anything wrong with having Discrete Math, Data Structures or Computer Science Theory as pre-requisites for PLT or a compiler class. However, I doubt the necessity of OO Programming & Design and (so-called) Advanced Programming.


I did not like the Dragon book either. I feel like it taught me nothing and was difficult to read and comprehend. I had to find other sources of explanation at times when I was doing this. It seemed to be more a thought and theory book if I remember.

This was at least 6 years ago though, so my memories are a little cloudy.


I was taught compilers by one of the authors of the Dragon Book. It was one of the best courses I ever took.


Pretty sure that's just your school.

In the article, he mentions that the students don't get to compilers till 4th year. I find that a little strange. Some elementary understanding of how a high-level programming language translates to executable code is quite useful in other courses in a standard CS curriculum. At my school, in 2nd year we write a compiler for a tiny subset of C++ to MIPS assembly, and in 4th year you write a compiler for Java with a great deal of optimization.


They get some basic understanding about the compilation process at some classes through the first two years.

When I attended my CS course, in the 3rd years I got some understanding about programming language structure and learned about compilers at the end. It's cool that you can learn about that earlier.


We have a class on compilers at my university, but it's on the 4rd year. Creating a compiler/interpreter when you are just starting programming would be a big challenge.

@dovyski: it's nice to see this coming from a brazilian university -- I'm from UFABC, São Paulo


It's a big challenge indeed. About the compiler course, it's the same at UFFS: 4th year.

Thanks and nice to see a brazilian fellow around here :)


Whoaa that's pretty bad. I'm currently taking a computer architecture class in which we have to design and implement a working superscalar out of order processor in HDL. Even in that class, which is strictly focused on computer architecture, our professor stresses how important compilers are. Sure compilers have become pretty good at optimizing code but there is so much room for improvement especially as we increasingly move to multicore systems. Your professor really must be living under a rock...


Which school do you go to?

Compilers seems like a pretty standard CS course at most schools.


How the PLsy courses are split varies a lot. Where I went (Harvey Mudd), compilers was optional and not taken by most students, but there were two required Programming Languages courses, which involved parsing, ASTs, interpreters, tree transformations, and some other things you might otherwise find in a compilers course, but with less of a low-level flavor (no code-gen) and more of a PLs flavor.


The content of such a course also varies by university. I recently came across some university that did offer a compiler course, but it was all theory. Being a geek, I didn't really like that.


Most UK universities don't teach compilers for CS.


Thank you!


Sounds like a terrible place to be honest. With that mentality, in 60 years (when current compile designers are dead), nobody will know anything about compiler architecture.

On the third year at my university, there's a course centered around writing a Java compiler.


In 60 years, the best compiler writers will be computers. The singularity is near.


This is great. More professors need to be like you.

With the exception of barely 2 projects in uni, all my homework/projects were short and small scale, meaning I already knew how to complete 90% of it, and just had to learn the content from the latest lecture to complete the rest.

With this assignment I probably would have known ~40%, meaning I wold have to buckle down and actually get invested - figure out what I know and don't know, and self study to the answer. Because I wouldn't just be able to rely on my teacher and lecture notes, I would actually have to take (some) control of my learning to succeed (rather than expect the teacher and textbook to hand feed me everyone I need to know). Because I'm not hand fed everything, I would actually become motivated because my success (or failure) for once would be somewhat in my own hands. What a concept!!! Congrats.


Thank you!

You described my thoughts when I created the assignment :) Like I said, I was afraid to push them too hard. However I was denying them a real challenge, filled with surprises, learning experiences and fun.


Go check out Peter Norvig's "Write a lisp interpreter in Python": http://norvig.com/lispy.html

It's a very slick implementation. BTW, we had a similar requirement for our "Programming Languages" undergraduate course: to implement our own language with a BNF grammar, parser, etc (done in ML). It was one of the best learning experiences we had!


Thanks for the link and for sharing your experiences. Based on what you said, it looks like I'm on the right track to create a great learning experience. Nice :)


It's a neat idea, but I really want to hear an experience report on the results. I don't want to throw cold water on this, because I want it to work and I think it might, but I have a fair amount of experience here[0] and can think of a number of reasons why it might go poorly as well. All we see in the OP is, "here's an idea that's cool (and it is) and I'm optimistic that it will work."

[0]I have taught courses where intro-level students write Scheme interpreters (in Scheme, in Java) and courses where students implement interpreters for higher-level languages, and courses (well, one so far) where they implement an actual compiler (for a subset of C).


I will share the results on a future post. I understand your point, but I think the assignment will enhance their skills related to objects, classes, methods and instantiation.

That's my goal, not teaching about compilers or language structure. About that they will learn empirically.


Yup! And I didn't mean to sound so negative (sorry). I do hope it works (and want to hear about it when it does!). I've definitely gotten really excited about assignments before when I thought they were neat and that the students ought to be able to figure everything out---but which turned out to be just a bit too hard and thus demoralising rather than empowering. (If that happens, it's still not worth dropping---tweak it until it works. Once you get it working, there's probably a SIGCSE or ITiCSE paper in it for you!)


No problem :)

I really got your point now. I thought about the "demoralising X empowering" and my solution for that was to work with pairs of students. Working as a team, they can discuss and motivate each other.

When I blog about the results, I will highlight the number of pairs that completed, partially completed and did not complete the assignment.


Cool!

It's interesting that you want them to make a procedural language, although if they started with C, it makes some sense.

I wonder if you could broaden the experiment slightly by relaxing your constraints and saying they either need to allow reassignment of variables, or else support lambdas, and in exchange they can assume immutability.


Thanks!

I think a procedural language makes more sense for them at this time. They are just learning OOP, so it would be hard to implement an interpreter for that.

About your suggestion, that's cool. I think most of them are already assuming immutability and using floats/doubles as the type of all variables.


I agree with your blog post that it's a more interesting assignment than yet-another-Tic-Tac-Toe clone. I wish more curricula did this, and earlier! I think after the assignment is done it'd be worthwhile to show them that they can get all the Java-style OOPy features with just lexical closures, and I think showing Forth as a powerful, simple, and simple-to-implement language would be neat too. (Also yacc and lex deserve a mention, I like this PDF that accomplishes the goal of the assignment: http://www.scribd.com/doc/8669780/Lex-yacc-Tutorial )

If you relaxed the constraint on syntax and just stipulated Turing-completeness, it would be very amusing if someone implemented a clone of Brainfuck. ;) (Especially for the unlucky pair who has to write two programs for it!)


Thanks for the words and suggestions! About yacc they will meet that when they attend the compiler course.

I didn't see a Brainfuck clone yet, but I saw something in the same level of oddity. It was a language called "Star Wars", I guess. I can remember some commands such as "Darth Vader told to a = 1" and "Skywalker from 0 to 10" haha :)


So, from what I understand, they have the flexibility to choose their own syntax. If they're interested in doing the least amount of work possible, they could either implement a Lisp-like language or Forth. (Smalltalk is pretty teeny as well. A basic Smalltalk-y language would weigh in at 1/6th the size of Python's grammar.)


One pair of students decided to create a stack-based language. I guess they noticed it would save several lines of code and make the interpretation process easier to do.


The second thing I did after finding out about FORTH (after boggling for a bit and trying to figure out how the heck it worked without function arguments) was to go and design a stack-based language and implement an interactive interpreter for it in JavaScript. Largely just because I realized FORTH is so simple, I actually could do it.

Then my language design aspirations languished for about 10 years until I had a Discrete Mathematics class based around writing a parser and interpreter for Datalog (subset of Prolog).

Then there was the programming language design class where we implemented interpreters for various language features in Racket, but by then I forgot most of how to write parsers and (read) takes care of most of the hard part there. It certainly would've been more fun if we'd had to design our languages, rather than just implementing predefined ones.

And several computer architecture & an operating systems class that dealt with "this is what C code gets turned into", but no details on the actual process of doing the translation.

So, I guess this kind of thing has been fairly well represented in my formal education, but not very well integrated. Just disjointed bits and pieces as they're relevant for some other subject. It'd definitely be nice if they were more coherent earlier on, before tackling Compilers head on.


I see you have other students use the language at the end.

If you have a chance, it might be a rare opportunity to point out some of the innovative ideas, after all the projects are done.


Yeah, it's true. I will try to analyze and discuss the implementations at the end.


Could you keep us updated about that? Such a round-up post with interesting ideas could be marvelous!


Sure, I will blog about that in the future. Stay tuned :)


Will you be able to show off the results? I'd be really pretty interested to see what some 2nd year comp-sci students come up with.

The pipe dream part of me wants you to have them stick their results on github and work with each other, or start getting taught stuff by ... well, everyone.

My high-school comp-sci teacher is a large part of the reason why I can reason clearly about code, and he would push us in directions like this -- not as extreme, but similar in the "ok, here's an end goal that you can achieve by thinking about it and making combinations of what you know" sense.

I know a lot of programmers who don't grok compilers, even at a high level, and it's a shame -- a lot of things are a lot easier, and you open the doors to doing a lot more if you do.


Sure, I will share the results on a future post. I think the same way your high-school comp-sci teacher does, but this time I decided to break the line.

About the github idea, I like it, but my students are not familiar with git yet. I will try to blog about their main ideas and language syntax in the future.


Couldn’t commend you more highly for this—not least because it’s exactly what I would do if I were a CS professor. ;)

Interpreters, compilers, and games are the things that have taught me the most about programming and computing in general. Implementing programming languages pushes high-level CS concepts as well as firsthand understanding of how languages work. Games are a fun practical application of those concepts—embedded scripting languages in particular—plus trig, vectors, matrices, rendering, optics, audio, DSP, physics, you name it. And languages and games both provide plenty of opportunities to learn about optimisation.

I hardly know why CS is taught any other way.


Thanks!

I totally agree with you. My student's next assignments will be game-related: develop an agent to fight inside an arena and create a Guitar Hero like game.

I really think they will learn more and better if they have to accomplish that kind of assignments.


Our advanced CS class (first year) has written a scheme interpreter. We were going to port it to C, from racket, but we ran out of time in the class. I might just finish it for fun this summer.


Cool! I had a (mandatory) course on compiler construction in my second year of CS as well. It was one of the toughest subjects, but it also made me understand and appreciate programming languages so much better. Like everyone, we tried to create a language without the annoyances found in many modern languages. Like everyone, we found that those annoyances are actually really hard to avoid. It was tons of fun :)


I had the same fun when I created my own language during an also mandatory course on compilers :) Even though my goal is to enhance my student's skills about OOP, I hope they have fun creating a language.


Not sure what's unusual about this except for the fact that it was assigned to 2nd years. I had to create a compiler from scratch during my course work, and I'm sure many other CS programs require the same.

It is a great way to learn how a compiler works, but then again, in this day and age it seems less and less relevant to know that stuff, especially if you are going into a typical programming position.


We did this for our Compilers module during my second year CS degree at Nottingham university. It's a very interesting subject and I thought it gave a very good grounding in understanding new languages. Unfortunately we had to write our compiler in Java, we had the option of using Haskell but I didn't really learn get to grips with functional programming until the third year.


I am in a principles of programming languages class currently, and we are doing very similar work. We are writing interpreters, type checkers, and more for a new language our teacher has defined. We are doing all of this in Scala.


Cool! It looks like you are creating a more robust interpreter. The one I told them to implement is simpler (no type check, for instance).


I gave a talk yesterday on implementing a VM in 30 minutes. I'm not sure how well it went (it's a lot of material for 30 minutes), but I hope I at least got some people excited about implementing languages.


As a programmer, I really enjoy the idea of creating my own VM and make it able to interpret/run a language I designed. Surely some people got excited about that :)


I think this is a pretty good approach. You don't learn till you get your hands dirty. I hope kids do well.


the dragonbook is never more than an armlength away from my screen. love the idea.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: