The Mill CPU

I’ve seen the Mill CPU come up a number of times — maybe because I subscribed to their updates and so I get emails about their talks.  They’re getting a bunch of buzz, but every time I look at their docs or watch their videos, I can’t tell — are they “for real”?  They certainly claim a large number of benefits (retire 30 instructions a cycle!  expose massive ILP!), but it’s hard to tell if it’s just some guy claiming things or if there’s any chance this could happen.

They make a big deal out of their founder’s history: “Ivan Godard has designed, implemented or led the teams for 11 compilers for a variety of languages and targets, an operating system, an object-oriented database, and four instruction set architectures.”  At first I thought this was impressive, but I decided to look into it and I can’t find any details about what he’s done, which isn’t a good sign.  If we’re counting toy projects here, I’ve defined 5 languages, an ISA, and an OS — which is why we don’t usually count toy projects.


They also revealed in one of their talks that they don’t have anything more than a proof-of-concept compiler for their system… but they have “50-odd” patents pending?  They said it’s “fairly straightforward to see” the results you’d get “if you’re familiar with compilers”, and when harder questions were asked Ivan started talking about his credentials.  I feel less convinced…

This sounds like a lot of stuff that’s been attempted before (e.g. Itanium), unsuccessfully.  They have some interesting ideas, but no compiler, and (if I remember correctly) no prototype processor.  It also bugs me when people over-promise: Ivan talks about what they “do” rather than what they “plan to do”, “want to do”, or “have talked about doing”, which feels disingenuous if it’s just a paper design right now.

The more I look into the Mill the more I don’t think it’s real; I think it’ll fizzle out soon, as more people push for actual results rather than claims.  It’s a shame, since I think it’s always cool to see new processors with new designs, but I don’t think this will end up being one of them.

10 responses to “The Mill CPU”

  1. Couldn’t agree more. The Itanium comparison does come to mind big time.
    More philosophically, as many people dealing with architecture (myself included) have noticed, a lot of things look good on paper until you actually simulate the needy-greedy details.


  2. > until you actually simulate the needy-greedy details.

    ITYM “nitty gritty”? I’ve not seen that eggcorn before.


  3. You bet it’s for real! And wide open for you to see. If you had any knowledge in the field you’d see the violent beauty of their design. Now you can’t see beyond the FUD you are spreading.


  4. The CPU could be fabricated with an FPGA of some description. It would be slower and potentially smaller in some way, but prototype hardware shouldn’t be inaccessible to almost everyone. Once such a thing exists it’s possible to show the performance benefits with your basic compiler for the language and such. I do get that in today’s age people want money before they have worked out how to make the thing they want to make and sell, but practically, when it sounds like snake oil, you need to do the moderate amount of heavy lifting yourself to prove your concept. That isn’t going to happen if they spend most of their time presenting it at conferences.


  5. Your skepticism is completely justified. The Mill may never reach market – we are a startup, and most startups fail; it’s a fact of life. Although we’ve survived for over a decade, which is pretty good for startups these days.

    But it sounds like you are less skeptical about Mill Computing the company, and more about Mill the technology and architecture. There are fewer grounds to doubt that. As fast as we have been able to get the patents filed (I seem to have been doing nothing else for the last two years. I *hate* patents) we have been completely opening the kimono and showing the technical community, in detail, how each part works. Why? Because we wanted outside validation before wasting another decade on something that was fatally flawed in some way we had overlooked.

    If there was any part of the public Mill that one could point at and say “See? that won’t work, because …” then the web would have been all over us. But you know? Skepticism we get, aplenty. What we don’t get is *informed* skepticism. In fact, the more senior and skilled the commenter, the more they fall in love with the design. Like Andy Glew said one time (and if you don’t know who that is then you are not in the CPU business) – “Yeah, it’ll work, just the way he says it will”.

    Sometimes people complain that our presentations are insufficiently detailed to fairly evaluate. Guilty as charged; they are oriented for a high level audience interested in the subject, but not for the specialist. However, if you ask for details on our forum (mill or the comp.arch newsgroup), as hundreds have, you will get all the details you want until they flood out your ears and collect in puddles on the floor.

    In these days of internet time, when idea to market is measured in days or weeks, it’s easy to forget that not all the economy works that way. Building steel mills, cement plants, and yes, CPU silicon takes a *long* time and a *lot* of money. We have deliberately swapped money for time: we are a bootstrap startup, not looking for VC funding. There’s good and bad in that choice: a decade without a paycheck is not easy, but today we own it – all of it – and feel we got a good deal.

    The proof of the Mill pudding will be when there’s a product with pins on the bottom, and that won’t happen for some years yet. We learned in our first presentation not to make projections of what the eventual chip will have for numbers. Yes, we have guesstimates internally, but we’re quite sure those will be off by a factor of two. The problem is that we have no clue which direction they will be off.

    If you have the technical chops to understand a CPU design from first principles then please dig as deep as you can into our stuff and tell us – and the world – what you find. Otherwise you will just have to join us as we wait and work and see. We’ve never said anything different.



    • Hi Ivan, thanks for the response! I never expected my post to gain much traction let alone from the source itself.

      I can definitely appreciate the reasons for talking about something before it happens; we made the decision to announce Pyston and start talking about it well before it was ready. I also agree with some of the comments I’ve seen elsewhere that even if you happen to never build any chips, a fresh CPU design that’s as thought out as this will definitely influence CPU design going forward. And that might even be a large part of your goal (compared to building something just for the goal of being a commercial success), which I can definitely get behind. I think if you were a bit more clear about the state of the project and the extent to which your ideas have been validated, it could help prevent readers like me from getting the wrong idea about what you’re getting at. I think this post of mine, as well as some other people’s reactions I’ve seen, are somewhat born out of feeling misled about what exactly The Mill CPU is at the moment.

      But regardless, you’re attempting something crazy ambitious and game-changing, which I have to respect and root for 🙂


  6. It’s nice to see something new in the field, but static scheduling has failed to deliver performance several times in history. Take the i960 as an example. In real life every fourth instruction is a jump.
    Executing code in two directions? Doubles the chance of cache miss.
    Single address space? Good luck with implementing a fork call.
    Post-compilation on the target? Fine, if you never need to debug…
    Overly complicated architectures don’t seem to ever succeed and no compatibility with existing software makes me rather skeptical. Anyway, good luck!


    • Executing code in two directions may double the chance of a miss, but executing out of two separate caches lets you double I$ size without increasing latency from speed-of-light delays, which halves the chance of a miss.

      Also, further talks have revealed some of the things the Mill can do:

      – Pack short loops into a single parallel instruction, yielding a loop throughput of one iteration per cycle.
      – Use smear operators to effectively vectorize while loops.
      – Use selection operators to reduce the number of branches needed.
      – Save caller registers lazily in parallel with callee code, making functions cheap.
      – Implement fork() with what’s basically a segment offset into virtual memory.
      – A bunch of ops that make stack frames cheap.
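
To make the selection-operator point in the list above a bit more concrete, here is a rough sketch in Python. It is not Mill code and says nothing about how the Mill actually encodes its operations; `select_branching`, `select_branchless`, and `clamp_to_zero` are hypothetical names made up for illustration. It only shows the general idea of computing both candidates and picking one with a mask instead of jumping.

```python
def select_branching(cond: bool, a: int, b: int) -> int:
    """Pick a or b with an ordinary conditional (a branch on most CPUs)."""
    if cond:
        return a
    return b


def select_branchless(cond: bool, a: int, b: int) -> int:
    """Pick a or b with mask arithmetic instead of a conditional.

    Hypothetical helper: it mimics what a hardware select computes,
    not how any particular CPU implements it.
    """
    mask = -int(cond)  # -1 (all ones) when cond is true, 0 otherwise
    return (a & mask) | (b & ~mask)


def clamp_to_zero(values: list[int]) -> list[int]:
    """Replace negative entries with 0, using one select per element."""
    return [select_branchless(v < 0, 0, v) for v in values]
```

Both select functions return the same value for any inputs; the second one just has no data-dependent jump in its body, which is the property a select operation is meant to give the scheduler.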

