I spent some time this weekend looking into different FPGA options for potential future projects; I've been using the Spartan-6 on my Nexys3 board, and I created a simple breakout board http://oshpark.com/shared_projects/duLs3P1R for it, but I started to learn more about the limitations of staying within that single product class. The Spartan-6 is limited on the high-end, though Xilinx will happily advertise several alternative lines such as their 7-Series, or any of their Virtex chips, which can cost up to $20k for a single chip. One thing I was interested in, though, is what options there are on the lower end of the Spartan-6.
There are two sides to this: the first is that the cheapest Spartan-6 is about $11 on Digikey, and the second is that the smallest package (both in terms of pin-count and physical size) is a 144-TQFP which has nominal dimensions of 20mm x 20mm (not including lead length). You can see from my breakout board that it took me about 2 in^2 of space to fit the 144-TQFP with its connections, and I didn't even break out all the pins in order to save space. This puts the minimum cost for a one-off Spartan-6 board at around $20, which would be nice to bring down.
So, at this point I started to look into other lines. Doing some searching on Digikey, there are only a few parts that come under the $10 mark: older Spartan parts such as the Spartan-3A, and Lattice FPGAs such as the ICE40. The Spartan-3A seems pretty promising, since it's quite similar to the Spartan-6 both in terms of toolchain and electrical properties. The smallest Spartan-3A costs $6 and comes in a 100-QFP package, which is about half the size and cost of the smallest Spartan-6. I haven't gone through and created a breakout for this part, but assuming the size scales somewhat linearly, it should come in at about 1 in^2 for a total cost of about $11.
Once I started to think about this, though, I noticed that the cost driver seems to be the fact that Xilinx puts so many IOs on these parts. Maybe for "real" purposes the minimum of 102 IOs on the Spartan-6 (68 on the Spartan-3A) doesn't seem like that much, but for simple boards that I want to make (ex: VGA driver on an SPI interface) this is way more than I need. So, let's look beyond Xilinx FPGA parts and see what else is out there.
As I mentioned, the other sub-$10 FPGA parts on Digikey are Lattice parts. I don't know much about that company, but some of their parts are quite interesting: they offer a $1.65 ICE40 FPGA (which apparently costs $0.50 in volume) that comes in a 32-QFN 5mm x 5mm package, or a slightly larger $4.25 part in an 84-QFN 7mm x 7mm package. They also offer a large number of cheap BGA parts, but the pitches on them are 0.4mm or 0.5mm, and I calculated that the OSH Park design rules require about a 1.0mm pitch (also that if I use a two-layer board, 256 balls is about the max). Anyway, the $1.65 part seems interesting; it looks like a competitor to the CoolRunner-II CPLD, the smallest of which is a $1.15 part that also comes in a 32-QFN package. The ICE40 lacks a lot of the features of the Spartan line, but that's probably a good thing for me since I am not planning on using gigabit transceivers in the near future. I downloaded the Lattice software to test it out; people complain a lot about the Xilinx software, but at first glance the Lattice software doesn't seem any better. I'm still trying to figure out how to program a Lattice FPGA without buying their expensive programmer; I'm sure their programmer is a standard JTAG driver, but I'm still trying to figure out how to have their software output SVF files so I can use other hardware. Overall, I haven't been that impressed by the Lattice software (the installer never finished) or documentation (there are lots of links to Lattice employees' home directories, infinite redirect loops in local help, etc), so I'm not sure it's worth learning this whole new toolchain in order to have access to these parts that I may not need.
But now that I had compared the Lattice parts to the Xilinx CPLDs, I was interested in how much you can use those parts. To test it, I took the ICE40 sample program of a few blinking LEDs and ran it through the Xilinx tools for the CoolRunner, just to get a quick comparison of the relative capacities. The sample program takes somewhere between 10% and 20% of the ICE40 part (not exactly sure how to interpret the P&R results), but it takes about 90% of the CoolRunner -- apparently the large counter, which divides the external clock into something more human-visible, is a bad match for the CoolRunner. There are larger CoolRunner options, but it seems like once you start getting into those, the Spartan-3A line looks attractive since it has way more capacity for the same price. The CoolRunner does feature non-volatile configuration memory, which does seem nice, but I don't quite understand the cases where the expensive CoolRunner parts make sense.
On the other side of the spectrum, I was also interested in larger options. Specifically, I was interested in options with the best logic-capacity-per-dollar ratio; I'm sure for some use cases you really need a single chip with a certain capacity (I guess that's where the $20k FPGA comes in), but for my purposes let's look at the ratio. To do this, I downloaded the list of FPGA prices from Digikey, and ran them through a script that divides the "Number of logic elements" field by the cost for one unit. The "number of logic elements" has different meaning between manufacturers or even product lines (for the Spartan 3A, it's the number of 4LUTs, and for the Spartan 6, it's 1.6x the number of 6LUTs), so it's not really apples-to-apples, but this is only a rough comparison anyway. Here's what I got, selected results only:
$228.00 with 301000 LE, the Altera 'IC CYCLONE V E FPGA 484FBGA' is 1320.2 LE/$ [overall best]
$186.25 with 215360 LE, the Xilinx 'IC FPGA 200K ARTIX-7 484FBGA' is 1156.3 LE/$ [best xilinx]
$158.75 with 147443 LE, the Xilinx 'IC FPGA SPARTAN 6 147K 484FGGBGA' is 928.8 LE/$ [best Spartan]
$208.75 with 162240 LE, the Xilinx 'IC FPGA 160K KINTEX-7 484FBGA' is 777.2 LE/$ [best kintex]
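The ranking above boils down to one division per part. Here is a minimal sketch of that step, not the actual script: the dictionary keys are names I made up for illustration, and the three rows are copied by hand from the results in this post rather than parsed from the Digikey export.

```python
def rank_by_le_per_dollar(parts):
    """Return (ratio, part) pairs sorted from best to worst logic elements per dollar."""
    ranked = [(p["logic_elements"] / p["price"], p) for p in parts if p["price"] > 0]
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return ranked


# Hand-copied sample rows; a real run would read these from the downloaded price list.
parts = [
    {"mfr": "Xilinx", "desc": "IC FPGA 200K ARTIX-7 484FBGA", "price": 186.25, "logic_elements": 215360},
    {"mfr": "Altera", "desc": "IC CYCLONE V E FPGA 484FBGA", "price": 228.00, "logic_elements": 301000},
    {"mfr": "Xilinx", "desc": "IC FPGA SPARTAN-6 9K 144TQFP", "price": 15.69, "logic_elements": 9152},
]

for ratio, part in rank_by_le_per_dollar(parts):
    print(f"${part['price']:.2f} with {part['logic_elements']} LE, "
          f"the {part['mfr']} '{part['desc']}' is {ratio:.1f} LE/$")
```

Restricting the list to a package type (as in the 256-ball cutoff later on) would just be an extra filter on the description before ranking.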
Those are some pretty cool parts, but looking at the packages, unfortunately I don't think I'll be able to use them. I have a reflow toaster that I've had some mild success with, so I feel like BGA parts aren't off-limits as a whole, but these particular packages are definitely pushing it. Luckily, these are 1.0mm-pitch parts, which means that according to OSH Park design rules we can fit vias in the ball grid, but unfortunately we won't be able to route between those vias that we make! So we're going to have to not have vias for every ball; regardless, if I use a two-layer board, I'm not sure how many of the signals I'll actually be able to route out of the grid. So let's rule anything larger than a 256-ball BGA (the smallest kind for most families) as off-limits. Here's what we get:
$115.11 with 101440 LE, the Xilinx 'IC FPGA 100K ARTIX-7 256FBGA' is 881.2 LE/$ [best]
$34.25 with 24051 LE, the Xilinx 'IC FPGA SPARTAN 6 24K 256FTGBGA' is 702.2 LE/$ [best spartan]
$39.50 with 24624 LE, the Altera 'IC CYCLONE III FPGA 144EQFP' is 623.4 LE/$ [best non-bga]
$23.96 with 14400 LE, the Altera 'IC CYCLONE IV GX FPGA 148QFN' is 601.0 LE/$ [best qfn]
$15.69 with 9152 LE, the Xilinx 'IC FPGA SPARTAN-6 9K 144TQFP' is 583.3 LE/$ [best xilinx non-bga, and the one I made a breakout for]
Unfortunately it seems like you really do have to go to BGA packages if you want to use anything larger than 25k logic elements, so it looks like my plan might be to first test my ability to use this BGA part by creating a Spartan-3A breakout board, and then using the Artix-7 256FBGA part when I want a large FPGA.
I've recently been intrigued by the idea of using CPLDs in my designs -- there have been lots (maybe 2 or 3!) of times that I've wanted a specific set of glue logic, only to find out it doesn't exist or the output polarity is wrong, and I have to resort to using multiple 74xx chips. A CPLD seems like a natural solution to this: as mini-FPGAs (perhaps more correctly, FPGAs are mega-CPLDs), you can configure them to perform a large range of combinatorial logic, and you can get exactly the "part" you want. This means both the exact logic behavior, but also the more-or-less exact pin placement, which I'm increasingly discovering the importance of. CPLDs are pretty cheap, starting at about $1 on the low end, which is the price of two 74xx chips. Clearly, they're going to be more complicated than just snapping some off-the-shelf parts in, but it seems like they should be a relative shoo-in.
This is the story of me getting started with the Xilinx CoolRunner-II CPLDs.
The first issue is that CPLDs, at least CoolRunner-IIs, only come in SMD packages. Thankfully the situation isn't as bad as FPGAs, where almost all parts come in BGA packages, but it means that I needed to find a TQFP adapter for prototyping (part of the reason why I wanted to make my own). Here's a picture of the 0.8mm-pitch chip soldered onto the adapter; I was worried about the extra-long adapter pads making this hard to solder, but it turned out to be fine:
Here's the rest of the circuit this is a part of:
There's not much on the breadboard in the middle, just a 1.8V voltage regulator, and some level-shifting circuitry (more on that in a sec). On the left is one of my "activity monitors" that was actually useful for debugging (good thing I've made 20+ of them for testing purposes!). On the top is a little ATmega328 board serving as a JTAG programmer.
The first hiccup I encountered was with voltage levels. The CoolRunner-II has four supply voltages: a 1.8V "core" voltage, and three adjustable voltages (two for the two IO banks, and one "auxiliary" for the JTAG interface). I was prepared for the idea of the cpld requiring a lower voltage, even though it would mean adding a 1.8V and 3.3V supply.
Once I started to look into it further, however, I uncovered more issues. First of all, the CoolRunner-II is *not* 5V tolerant, even with current-limiting resistors (ie there's a voltage limit separate from a current limit). Most parts I've dealt with (such as the ATmega328) have ESD protection diodes, which automatically clamp the input voltages into typically-acceptable ranges; the CoolRunner-II doesn't. With those diodes, although you have to worry about the current you send across them, you can interface a higher-voltage signal to a lower-voltage part simply by putting a current-limiting resistor in series. Unfortunately, this doesn't work for the CoolRunner-II, which means that you have to use some other method, typically involving at least two discrete parts, or a fraction of a different IC that can do this level-shifting. For most uses this isn't that bad, but for the purpose of reducing the "glue" complexity in the system, it's a step in the wrong direction.
Let's ignore this for now; my thoughts are that I'll probably move designs involving cplds to using a main supply voltage of 3.3V. As a sidenote, technically running an ATmega at 16MHz and 3.3V is out of spec (ie overclocking), though reading on the internet this seems to be a common thing that people don't report issues with (the issues probably happen at one of the temperature extremes). For the testing, I wired up some resistor divider down-shifters and a simple NPN up-shifter, since I was running all four voltages at 1.8V.
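For reference, the arithmetic behind a resistor-divider down-shifter is a single formula. This is only a sketch: the 5V source and the 18k/10k resistor values are illustrative assumptions, not the values on my breadboard, and it ignores any load on the output.

```python
def divider_output(v_in, r_top, r_bottom):
    """Unloaded output voltage of a two-resistor divider (output taken across r_bottom)."""
    return v_in * r_bottom / (r_top + r_bottom)


# Assumed example: a 5V signal feeding a 1.8V input through 18k (top) and 10k (bottom).
v_out = divider_output(5.0, 18e3, 10e3)
print(f"{v_out:.2f} V")
```

The goal is simply to pick a ratio that lands at or slightly under the IO bank voltage, since the part has no tolerance for going over.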
Now that I had the chip powered up, the next step was to get it to do something. The CoolRunner-II, and most of Xilinx's (and I presume other people's) parts are programmed over a 4-wire interface called JTAG. I won't really get into the details of JTAG, but for a single chip it looks pretty much like SPI, but the benefit is that it's designed for hooking multiple chips into a chain, so that you only need 4 wires regardless of the number of chips you have (in theory).
When I encounter a new protocol, I usually put a simple shim program on my ATmega328, which listens on the serial interface and allows a computer script to control all the IO pins. This means I can use Python on my computer, avoiding recompiling+reprogramming, plus giving me more control and visibility into the program. The downside, of course, is the overhead, but for trying out new protocols that's not as important. You can find the code here, but in essence it allows you to write the "Blink" Arduino sketch like this:
import time

# DebugController comes from the shim code linked above
controller = DebugController("/dev/ttyUSB0", br=500000)
controller.pinMode(13, "output")
while True:
    controller.digitalWrite(13, 1)
    time.sleep(0.5)
    controller.digitalWrite(13, 0)
    time.sleep(0.5)
Using this, I wrote a very simple Python script that bit-banged the JTAG protocol and was able to get the chip to respond! It took me a few tries to get the bit timings exactly right (ie when exactly a bit is read by the cpld, and which clock cycles the cpld's outputs correspond to), but I was able to eventually read the part IDCODE out.
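To make the bit-timing question concrete, here is a self-contained sketch of the IDCODE read, run against a software model instead of real pins. This is not my actual script: `FakeTap` is a hypothetical stand-in for the chip, the IDCODE value in the usage lines is an arbitrary placeholder, and the sketch assumes the part selects IDCODE as its data register after reset (which the JTAG standard specifies for parts that implement IDCODE).

```python
# TAP state transitions from the JTAG standard: state -> (next if TMS=0, next if TMS=1)
NEXT_STATE = {
    "RESET":      ("IDLE",       "RESET"),
    "IDLE":       ("IDLE",       "SELECT_DR"),
    "SELECT_DR":  ("CAPTURE_DR", "SELECT_IR"),
    "CAPTURE_DR": ("SHIFT_DR",   "EXIT1_DR"),
    "SHIFT_DR":   ("SHIFT_DR",   "EXIT1_DR"),
    "EXIT1_DR":   ("PAUSE_DR",   "UPDATE_DR"),
    "PAUSE_DR":   ("PAUSE_DR",   "EXIT2_DR"),
    "EXIT2_DR":   ("SHIFT_DR",   "UPDATE_DR"),
    "UPDATE_DR":  ("IDLE",       "SELECT_DR"),
    "SELECT_IR":  ("CAPTURE_IR", "RESET"),
    "CAPTURE_IR": ("SHIFT_IR",   "EXIT1_IR"),
    "SHIFT_IR":   ("SHIFT_IR",   "EXIT1_IR"),
    "EXIT1_IR":   ("PAUSE_IR",   "UPDATE_IR"),
    "PAUSE_IR":   ("PAUSE_IR",   "EXIT2_IR"),
    "EXIT2_IR":   ("SHIFT_IR",   "UPDATE_IR"),
    "UPDATE_IR":  ("IDLE",       "SELECT_DR"),
}


class FakeTap:
    """Hypothetical software model of a chip whose default data register is IDCODE."""

    def __init__(self, idcode):
        self.idcode = idcode
        self.state = "RESET"
        self.shift_reg = 0

    def tdo(self):
        # Only meaningful while the TAP is in SHIFT_DR.
        return self.shift_reg & 1

    def clock(self, tms, tdi=0):
        # Models one rising edge of TCK: act on the current state, then move on.
        if self.state == "CAPTURE_DR":
            self.shift_reg = self.idcode
        elif self.state == "SHIFT_DR":
            self.shift_reg = (self.shift_reg >> 1) | (tdi << 31)
        self.state = NEXT_STATE[self.state][tms]


def read_idcode(tap):
    for _ in range(5):                 # five clocks with TMS=1 reach RESET from any state
        tap.clock(1)
    for tms in (0, 1, 0, 0):           # IDLE -> SELECT_DR -> CAPTURE_DR -> SHIFT_DR
        tap.clock(tms)
    value = 0
    for bit in range(32):
        value |= tap.tdo() << bit          # sample TDO before clocking; LSB comes out first
        tap.clock(1 if bit == 31 else 0)   # TMS=1 on the last bit leaves SHIFT_DR
    tap.clock(1)                       # EXIT1_DR -> UPDATE_DR
    tap.clock(0)                       # back to IDLE
    return value


tap = FakeTap(0x12345679)              # placeholder value, not a real part's IDCODE
print(hex(read_idcode(tap)))
```

The two details that are easy to get wrong are both visible in `read_idcode`: the first bit is already on TDO when the TAP enters SHIFT_DR, and the final bit is shifted on the same clock that raises TMS.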
The next step was to actually get this thing programmed. I won't go into how to write the program for it, but suffice it to say that Xilinx's tools are able to produce a "SVF" file, which is essentially a list of JTAG operations that will result in the chip being programmed. Luckily, SVF is a very simple format and the main hard part in writing a SVF reader+player is to understand JTAG.
I only implemented the subset of SVF that the Xilinx tools output, so it was surprisingly manageable to write the player, which you can find here. I excitedly started running it on the full programming file, and... it took 5 minutes to program the CPLD. The obvious thing to do would be to port the SVF player to the ATmega328 and just send the SVF file down instead of the bit-bang instructions, but I didn't feel like doing that; after some performance debugging, I found the issue was that I was waiting for the ATmega328 to respond to each command before issuing the next one, so I changed it to only block when the SVF specified to verify the output, which tends to be at end of long transfers. I was able to get the programming time down to 25 seconds, which is starting to be bearable, but if I do end up using CPLDs I hope I have time to rewrite the programmer.
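The "only block when there's something to verify" idea can be sketched as follows. This is not the linked player: the function names are hypothetical, only the SIR/SDR scan commands with their TDI/TDO/MASK/SMASK fields are handled, and a missing MASK is treated as "compare every bit", which simplifies how the SVF format carries masks over between commands.

```python
import re

# Matches fields such as "TDO (f6e5f093)"; SMASK is listed before MASK for clarity.
_FIELD = re.compile(r"(TDI|TDO|SMASK|MASK)\s*\(([0-9A-F\s]+)\)", re.IGNORECASE)


def parse_scan(command):
    """Parse one SIR/SDR command into (op, bit_length, {field: int})."""
    words = command.strip().rstrip(";").split(None, 2)
    op, length = words[0].upper(), int(words[1])
    rest = words[2] if len(words) > 2 else ""
    fields = {}
    for name, hex_digits in _FIELD.findall(rest):
        fields[name.upper()] = int("".join(hex_digits.split()), 16)
    return op, length, fields


def needs_readback(fields):
    """Only scans that specify TDO require waiting for the device's response."""
    return "TDO" in fields


def scan_matches(fields, actual, length):
    """Compare the bits read back against the expected TDO under the mask."""
    mask = fields.get("MASK", (1 << length) - 1)   # simplification: default to all bits
    return (actual & mask) == (fields["TDO"] & mask)


op, length, fields = parse_scan("SDR 32 TDI (00000000) TDO (f6e5f093) MASK (0fff8fff);")
print(op, length, needs_readback(fields))
```

A player built on this can stream every scan without a TDO field back-to-back and only stop to read the serial port when `needs_readback` is true.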
Anyway, at this point I had a CPLD programmed as a single XOR gate, and it was something of a rush to see it finally respond when I changed the inputs :)
Improving the build process
One thing I've always been slightly annoyed with when it comes to Xilinx development is how tedious it can be to do a compile-and-reprogram cycle. For FPGAs, it involves double-clicking a small list item in the ISE interface, waiting, then going to the Impact program, and clicking another rather small button. This is worse with my new custom programming step, since I have to use Impact to generate the SVF file, then go to the shell and program the device.
I thought about adding support for my programmer in Impact, or creating some sort of custom script in ISE (which I'm vaguely aware is possible), but this quickly turned into a dead end. Instead, it was much simpler to simply take the commands that ISE itself runs (with some help from this site), and put them into a Makefile so that I can live on the command line. Again, it took a couple tries to get right -- if you don't erase the CPLD first (the default is to not erase) it won't program correctly -- but eventually I had a working Makefile that could all-in-one program my cpld with only one command and no mouse clicks. You can find the example project here.
Now that I was confident in being able to use the CPLDs, the next thing I wanted to do was create a simple breakout board with some switches and lights. Here's what I came up with:
It's definitely not as full-featured as a real CPLD breakout board, but I just wanted to throw something together quickly for testing. It takes a 3.3V supply voltage, and has an on-board 1.8V regulator. It runs the two IO banks at different voltages, one at 1.8V and one at 3.3V. There are eight slide switches and eight leds, which should hopefully be better for debugging than moving wires around and looking at the multimeter.
Update: I just noticed that I have the two IO banks backwards. This mainly means that the voltages on the two banks are swapped, which means the LEDs won't work without some hot-fixing.
One thing I want to do with this is create a simple ATmega project board, but with a CPLD as configurable router. Most of my projects only involve a few IO pins on the ATmega, but those pins tend to differ project-by-project. I tried to think of ways to reduce the pin count on the board, such as by connecting multiple pins to the same pin header, but using a CPLD seems like a way around this.
Also, since CPLDs are somewhat like miniature FPGAs, I think I'm ready to put together my first FPGA board. But first, my BLDC driver boards have arrived, so I'll probably play around with one of those.
I've done a series of circuit board layouts of my simple "Activity Monitor" circuit, at first to teach myself PCB design and try out getting them made, and lately to test different pcb limits, primarily miniaturization. In my last post about it, I talked about putting together a board I thought was high-density, and how I submitted a new design that was even more dense. Well, I got back the boards today and here's how it went. First up, here's a picture comparing the four iterations of the boards:
The left-most board is my first attempt, and the first circuit board I ever created. It uses through-hole components exclusively, and measured 1.69" x 2.55". The green boards are my second attempt, my second circuit boards ever, and my first experimentation with surface-mount components. I could have made this smaller if I wanted but the goal for that iteration was to try out SeeedStudio (hence the non-purple color), and to test smd components.
The next pair are my first attempt at miniaturizing the circuit, using all surface-mount components and taking out "extras" like the mounting holes and second (duplicate) pin header. I also bumped up the number of signals from 4 to 6, so there's actually about 50% more logic in this board than the first two iterations. Since this was such a reduction from the previous iteration, I didn't want to shrink it any more than this, despite thinking that I could. Here's a close up of the most recent iterations:
As you can see, I made the fourth and final rev smaller mostly by taking out the spaces between the components, but also by breaking symmetry and hand-routing the board, allowing me to pack the components this tightly, getting down to a final size of 0.85" x 0.48". Here are pics of the soldered board:
Sorry for the poor image quality -- I had to take these through my cheap 10x loupe to get any reasonable level of detail.
There are a few things I learned from this: the first is that average density doesn't matter at all; all that matters is local density (ie space on the other side of the board doesn't make the current side any easier to solder). When I was making these I was worried about "how small the board is compared to the number of components", but this really doesn't matter compared to how close any two individual components are.
The second thing I learned is that the order in which I soldered the components is even more important at this scale. The method I've settled on for most boards is 1) tin one pad for each component 2) pick out all the components for a type and attach them to their pads 3) go back and fully solder all the components. This didn't work out so well on this board, since by the time I had placed the 1206 capacitors, the TSSOP parts were very hard to get to. Instead, I fully-soldered the TSSOPs first, which made it much more feasible, though it threw off my method.
This worked fine for assembly, but not for any rework: I wanted to change some resistors out (the new LEDs I tried were too dim), and I resorted to pulling off multiple parts in order to get access to the resistors I wanted to remove (if I had solder tweezers maybe this wouldn't have been an issue).
The last thing is that I was somewhat validated in worrying about putting the pads close together: when I tinned a specific pad, the solder ran into the next one. It turns out that these were connected by a trace so it was fine, and perhaps the trace is what allowed the solder to run, but overall I am now more worried about that.
So in conclusion, I feel confident that this density is hand-solderable, especially now that I have 0.012" solder and a 0.4mm soldering iron tip, but I'll probably scale back somewhat, especially around "tall" items like the 1206 capacitors (which I'm not planning on using anymore anyway).
I had messed up one of my circuit boards, and in the debugging process I wanted to eliminate the external crystal as a problem (since there are so few other components in a bare AVR circuit). I looked at the ATmega328 manual, and found that it features an "Internal 128kHz RC Oscillator", which sounded safe enough. I pulled out my Arduino-based ISP programmer, and set the appropriate fuse to select that clock.
The first thing that I realized after doing this is that 128kHz is 125 times slower than 16MHz, so that triple blink that happens when you start up your Arduino, that usually only takes about a second, will take on the order of minutes in this mode. I never had the patience to actually sit and wait this long, but the next option in the Clock Sources list is a "Calibrated Internal RC Oscillator". I had stayed away from this since I didn't want to calibrate it, but it seems like you don't have to calibrate it yourself if you're willing to tolerate a 10% frequency uncertainty.
So the ATmega went back into the programmer, but then I got a message along the lines of "invalid device signature: 0x000000". Uh oh. I knew that this was a possibility: for some reason, the ATmega parts use the programmed clock source when in programming mode, instead of defaulting to an internal source, which means that it's possible to program your ATmega into a mode in which it is no longer programmable. I knew of this issue, so made sure to not set it to "external clock" mode, so I was surprised that it was still unprogrammable.
After doing some research, I learned the problem is that there is a maximum programming speed based on the clock speed of the target ATmega, and my Arduino-programmer was exceeding it.
Slowing down the programmer
Ok, time to dive into the ArduinoISP source code. It's using a lot of registers that I'm not familiar with, but one that looks promising is the "SPCR" register in the spi_init() function. The ArduinoISP sketch uses the hardware SPI support, and SPCR is the SPI Control Register. Looking at the docs, the bottom two bits of this register control the SPI clock speed, which seems promising -- but the sketch already has them set to the minimum setting of 1/128 of the cpu clock. I forget the exact maximum ISP clock, but it's smaller than the target's cpu clock, and since the two ATmegas' clocks were off by a factor of about 125, this wasn't going to work.
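Putting numbers on it makes the problem obvious. The 16MHz, 128kHz and 1/128 figures are from above; the "ISP clock must stay below a quarter of the target's clock" limit is my reading of the datasheet's serial-programming requirements and should be treated as an assumption.

```python
PROGRAMMER_CLOCK_HZ = 16_000_000   # the Arduino running ArduinoISP
TARGET_CLOCK_HZ = 128_000          # the target on its internal 128kHz oscillator
SLOWEST_HW_DIVIDER = 128           # slowest hardware SPI setting

# Slowest SPI clock the hardware SPI peripheral can produce.
slowest_hw_sck = PROGRAMMER_CLOCK_HZ / SLOWEST_HW_DIVIDER

# Assumption: the ISP clock has to stay under 1/4 of the target's cpu clock.
ASSUMED_MAX_SCK_FRACTION = 1 / 4
max_target_sck = TARGET_CLOCK_HZ * ASSUMED_MAX_SCK_FRACTION

print(f"slowest hardware SCK: {slowest_hw_sck / 1000:.0f} kHz")
print(f"assumed limit for this target: {max_target_sck / 1000:.0f} kHz")
print("hardware SPI is slow enough:", slowest_hw_sck <= max_target_sck)
```

Even at its slowest setting the hardware SPI clock is nearly as fast as the target's own cpu clock, so it cannot satisfy any limit that is a fraction of that clock.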
One thing I debated doing was to reduce the clock speed of the programmer ATmega. If I had alternate-frequency crystals, it would have been decently easy to replace that, though I'm not sure what other effects that would have on the system (would probably screw up the baud rate calculation for the serial port). But, alas, I didn't, so the option would be to reprogram the programmer to use a different clock source, and I wasn't very confident in my ability to do that.
Instead, I decided to modify the programmer's firmware. As I mentioned, the ArduinoISP sketch uses the hardware SPI, which was already clocked as slow as it can go, so I'd have to come up with an alternative SPI implementation. Normally, the difficulty in this would be maximizing the potential clockspeed, but in this case that's not an issue and bit-banging the protocol seemed pretty easy. It probably should have been, but I messed up some of the bit twiddling and I started to run into how much harder it is to debug firmware than it is to debug software running on your primary computer.
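The shape of a bit-banged SPI transfer is roughly the following. The real change lives in the Arduino sketch; this is a Python sketch of the same idea, with the pin-access callables and the `Loopback` test fixture being hypothetical names, and it assumes SPI mode 0 with the most significant bit first.

```python
import time


def spi_transfer(byte_out, write_mosi, read_miso, write_sck, half_period_s=0.0):
    """Clock one byte out and one byte in, MSB first, sampling on the rising edge."""
    byte_in = 0
    for bit in range(7, -1, -1):
        write_mosi((byte_out >> bit) & 1)   # set up data while the clock is low
        time.sleep(half_period_s)
        write_sck(1)                        # rising edge: both sides sample
        byte_in = (byte_in << 1) | read_miso()
        time.sleep(half_period_s)
        write_sck(0)                        # falling edge: ready for the next bit
    return byte_in


class Loopback:
    """Hypothetical test fixture that wires MOSI straight back to MISO."""

    def __init__(self):
        self.mosi = 0

    def write_mosi(self, bit):
        self.mosi = bit

    def read_miso(self):
        return self.mosi

    def write_sck(self, level):
        pass


wire = Loopback()
print(hex(spi_transfer(0xA5, wire.write_mosi, wire.read_miso, wire.write_sck)))
```

Since speed is the one thing that doesn't matter here, `half_period_s` can be made as long as needed to stay under the target's limit.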
Anyway, I was eventually able to get it to work, though the programming was painfully slow (several minutes). If you have the misfortune of getting stuck with a 128kHz ATmega, you can use my modified ArduinoISP code to rescue it.
Since I didn't get the parts I wanted to play with this weekend (apparently Digikey is "upgrading [their] systems in the warehouse which has delayed some orders"), I got around to building something I'd been meaning to: some surface-mount (smd) adapter/breakout boards. So far I've been trying to pick components that have both through-hole and surface-mount variants, so that I can prototype with through hole and then make a board with surface-mount, but as I go I'm finding more and more chips which are only supplied in surface-mount packages (for instance, CPLDs/FPGAs, or higher-end microcontrollers). A common solution to this is to obtain adapter boards that you can solder your smd component to, which breaks out the individual pins to a more prototyping-friendly format (usually 0.1"-pitch pins). There are a lot of smd adapters out there (proto-advantage.com has a large variety), but I felt like I could do better than what's out there, both in terms of functionality and price.
The main thing I want from an smd adapter is flexibility; perhaps there are some people who want to stock boards for every package and pin count out there (proto-advantage sells hundreds of varieties), but I'd like to have a small number of types of boards, since the whole point of the boards is to have them in advance so I don't want to stock too many types. I really like what Dangerous Prototypes did with their xQFP Breakouts -- each board can handle QFP parts with anywhere from 32 to 80 pins. One thing I thought could be better is to have breakouts on both sides of the board; their design leaves the back empty, which seems wasteful. So the first board I put together is just taking two of their layouts (the 0.5mm- and 0.8mm-pitch versions), adjusting them slightly and combining them on opposite sides of the PCB:
I had difficulty doing this kind of manipulation in Eagle until I discovered a couple things: you can have a board layout without any schematic, and the Group tool has a mode where you define the group by an arbitrary polygon (instead of a rectangle) which lets you get exactly the parts you want. That second feature came in handy since the 0.5mm and 0.8mm versions had different board sizes, so I had to stretch the 0.5mm version, which involved moving one quadrant at a time (not a rectilinear group).
I'm not sure why the design should stop at 80 pins; with the 1.96" x 1.96" SeeedStudio size, you should theoretically be able to fit 136 pins around the outside of the board, assuming the routing is possible. This is just slightly too small for my Spartan-6 FPGA I want to break out (the smallest package is a 144-TQFP), but it looks like there are 100-QFP chips out there and it might be worth supporting them. Then again, I'm not confident that I could consistently solder an 80-TQFP to this breakout and have it work, so maybe the limiting factor is assembly, not the pcb itself.
I'm also hoping that this board will handle QFN parts as well; I imagine that if you were to design a similar board but for QFN parts, it would look quite similar, though the part would probably be narrower.
The second board I made is essentially the same idea, but for different packages: TSSOP and SOIC. I tried to make the board narrow enough that it could theoretically be inserted onto a breadboard, though I think I left the margins a little too wide to make that work well. The narrow size meant I went with smaller (6-mil) traces, which might have been a mistake as well. Anyway, here's the board:
I'm less confident about this one, but considering that my current option is to "dead bug" the chips by soldering wires directly to the pins, I'll be happy once I have these which will probably take about three weeks. I've posted these designs, along with pretty much everything else I do on this blog, to my github.
I recently splurged and bought this $30 Target toaster oven ($36 online) to try my hand at reflow soldering. I'd been holding off on this since most of the articles I saw online talked about hacking into the mains-level toaster oven circuitry to add custom temperature controllers -- didn't seem worth it. But then I found this guy who seems pretty serious, who blogged about how he is able to reflow BGA's with an uncalibrated toaster oven. Also, I heard about OSH Stencils, which offers solder paste stencils for the low price of $0.60 per square inch (most other places I've seen are an order of magnitude more expensive). Newly-encouraged, I picked up a toaster from Target and started some experimenting.
My first test was with an old board that I had sent to Seeed Studio to try them out; it all went well, but they sent me 10 boards so I have a bunch of spares. At some point in the past I had bought this solder paste in anticipation of wanting to do this at some point; this is a 0.5cc syringe, which is much smaller than the 5 or 10cc syringes that seem standard, but it was only $3. Here's the board with the paste applied:
And a closeup:
Clearly, I didn't do a very good job of this. I didn't want to be too OCD at this point because I wasn't sure how the rest of the process would turn out, so I just forged ahead and placed the components:
Here are two pictures of the pcb inside the toaster; you can see that the paste has changed from a pastey grey to a milky white, which I think is due to the flux liquefying:
And then the solder went through another change, which looks like the solder melting:
I tried to copy the reflow profile from the blog post I mentioned earlier, but I found that my toaster did not heat the pcb as much as his apparently did. One thing that might have been affecting this is that I had left the "drip tray" in the toaster, which probably blocked a large amount of the heat that came out of the bottom heating element. In fact, the solder didn't melt at all until I had turned up the heat to "broil" setting and the top heating element came on.
Luckily, it seems like the solder is quite forgiving, and even though I left the board in the toaster for far longer than the recommended profile, nothing terrible seemed to happen. I haven't checked the components, though; I'm going to have to test on a board that is more easily testable.
I went pretty quickly on the paste application + part placement, since I wanted to get through the first iteration quickly, so I'm actually pretty happy with the first results despite how they look:
Yeah, some of the parts are completely not on their pads, but I feel like this is definitely a demonstration of feasibility, though there's a lot of tuning to do.
My next test was on the next iteration of this board, which is several times smaller, and has components on both sides. Once I pulled out this board, I realized there's no way that people use these syringes to actually apply the solder:
The nozzle is several times wider than the pads! Apparently you're supposed to put a needle tip on the top... live and learn. Anyway, the bare syringe actually worked for the 1206 capacitor pads, but for the 0603 leds and resistors I'd been using a hobby knife to apply small amounts at a time. But that quickly broke down when I attempted the TSSOP parts, and I resorted to applying paste across all the pins and hoping for the best:
Surprisingly, some of the pins do in fact have their own solder connections, though there are clearly a lot of bridges:
I definitely used way too much solder, but it does seem like the process is fairly robust to these kinds of issues.
The last test was to reflow some of the components on the top side of the board, and see what happened to the bottom. I was also interested in tweaking the reflow profile. Here's what I put into the toaster (I used the bottom board as a raft since the top board was only barely larger than the grill spacing):
You can see that I was pretty rushed about this too; the top pad, second from the right, definitely has too little paste on it, and the resistor ended up not sticking. That wasn't my main focus though; from this test I learned that 1) I really do need to turn on the broiler setting to get the solder to melt, and 2) the bottom components seem fine even after the second reflow. There was a fair amount of fumes when I opened up the toaster oven -- I'm hoping it was just the flux. I was actually surprised how much the whole process smelled; I use a fume extractor+fan combo which means that I get almost none of the fumes when I hand solder.
I'm pretty happy overall with the accessibility of this method: I spent about $40 and was able to get some proof-of-concepts despite rushing the whole process. The downside, though, is that it was definitely more work and time than hand-soldering these components would have been, especially for the 2-terminal passives. There are a couple things I want to do to continue experimenting:
- Order some stencils from OSH Stencils, which should hopefully increase the speed of the method
- Get a syringe tip so I actually have a chance of applying the paste by hand
- Create some process test boards that let me test the process more quickly and with less labor
- Once I do that, reflow a real board and test to make sure it actually works and I didn't damage the components
- Get some DFN/QFN/BGA parts, and see if I can start using those
But again, I feel like I can report that this method is feasible and not just in "some guy on the internet did it once" territory.
For the past couple weeks I've been working on a new project: implementing a brushless motor (bldc) driver, with the distant goal of building a quadcopter from scratch. Sidenote -- I'm still narrowing in on what I consider to be doing something "from scratch". BLDC drivers (apparently called "ESCs" in the RC world) seem to be easy to buy, but I wanted to build them myself. I'm having the pcbs made by OSH Park, but the pcb fabrication isn't something that I feel like I need to do myself; same goes for the motor construction. And I think this can change over time; I bought a cheap quadcopter frame for testing, but I plan to eventually build my own frame.
Anyway, there are a couple things that I found challenging in designing the driver:
- Learning the theory of the control -- it's not as simple as "simply apply a voltage and it goes", like the brushed DC motors I used for my simple robot. There are several good resources out there on this, though; somewhat surprisingly, though I guess it makes sense once you think about it, the best references are by different IC manufacturers who try to help you build your BLDC driver using their products. Anyway, here were the ones I found most helpful:
- Parts of the control algorithm -- specifically, the back-emf sensing required for sensorless control -- only work at high speeds, which makes the system hard to debug.
- It takes a surprising amount of power to lift even moderate weights using propellers. The motors I bought are 300W -- by comparison, the motors on my previous robot were sub-1W. There are two things that this means: a separate supply voltage (12V), and much higher currents. For my breadboard, I targeted 1A current draw, but for my eventual quadcopter I'm targeting 20A.
I started off by building a breadboarded prototype:
I'll be the first to admit that the circuit isn't very clean; I didn't know what I needed to build when I started off. Here's a simple diagram of the major functional blocks on this breadboard:
The alligator clips are connected directly to the metal tabs of the low-side mosfets, and go off to the motor on the right which is off-picture. I'm using a simple 12V, 1A "wall-wart", which I feed through my multimeter in 10A mode to get current measurement. I also bought a surprisingly-handy barrel plug switch, since I found myself cutting the 12V supply a large number of times.
I used a standard ATmega328, like you would find on an Arduino. I'll talk more about the code that's running on it in a sec.
One thing that is apparently important for sensorless bldc control is that you are supposed to drive both the high and low sides with the same PWM signal. At first, when I did no back-emf sensing, I used a "high-side on, low-side pwm" (called PWM-ON) system which worked pretty well, but then I read somewhere that this won't work for sensorless control. I haven't validated this for myself, but to be safe I added this part of the circuit and haven't tested taking it out.
This section consists of two 7408 quad dual-input AND gates; I take the 6 signals from the ATmega and AND each one with the PWM signal. I'm kind of bummed that I'm using two 14-pin chips for this function, since it feels like this should be a fairly standard thing to do. In fact, I found a chip that does almost exactly what I want: the 74139 dual 2-to-4 decoder. Here's a screenshot I grabbed from the datasheet:
Each decoder features two selection inputs and an enable input, and has four outputs. This sounds perfect: since I only want one active signal per high/low side at a time, each of the sides gets one of the decoders, I wire up its two selection inputs, and the PWM becomes the enable signal. This was all sounding great until I noticed that the entire chip uses active-low logic. This isn't a big deal on the input side, since it's easy to generate an active-low PWM, but the outputs are going to be fed into MOSFET drivers, which all seem to have active-high inputs. This could work by putting a hex inverter between the 74139 and the MOSFET drivers, but then we're back up at two chips and haven't gained much. There's a similar 74239 chip which is apparently the same as the 74139 but with active-high outputs, but a cursory search revealed that this chip is not / no longer made.
The decoder approach would let me save some pins on the ATmega, since it only needs four inputs rather than the 6 with the AND scheme. Since there are only 6 commutation states, you should only really need 3 bits of selection input: by playing around with it, you can see that this is actually pretty close to possible. The only issue is that you need to AND together some of the outputs, which introduces yet more parts (though perhaps as simple as a resistor array). Saving the pins would be nice, but I decided to go with the AND method, as I'll talk about later.
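To make the AND scheme concrete, here is a minimal Python sketch of it (not the actual firmware). The step order in `COMMUTATION_STEPS` is the conventional six-step sequence and is an assumption, as are all the names; the only thing taken from the post is that each of the six select signals gets ANDed with the shared PWM signal.

```python
# Conventional six-step commutation order (an assumption, not taken from the post).
# Each entry is (phase driven high, phase driven low); the third phase floats.
COMMUTATION_STEPS = [
    ("A", "B"), ("A", "C"), ("B", "C"),
    ("B", "A"), ("C", "A"), ("C", "B"),
]
PHASES = ("A", "B", "C")


def gate_signals(step, pwm):
    """Return the six gate signals for one commutation step.

    Mirrors the AND-gate scheme: each of the six select lines from the
    microcontroller is ANDed with the shared PWM signal.
    """
    high, low = COMMUTATION_STEPS[step % 6]
    signals = {}
    for phase in PHASES:
        signals[phase + "_high"] = (phase == high) and pwm
        signals[phase + "_low"] = (phase == low) and pwm
    return signals


def floating_phase(step):
    """Return the undriven phase for a step (the one used for back-emf sensing)."""
    high, low = COMMUTATION_STEPS[step % 6]
    return next(p for p in PHASES if p not in (high, low))
```

With the PWM signal high, exactly two of the six outputs are on in any step; with it low, all six are off, which is the "same PWM on both sides" behavior described above.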
I wired the 6 AND-ed control signals to six LEDs, which turned out to be surprisingly helpful for debugging. I used serial output from the ATmega for most of my debugging, but for a quick view of whether or not things were working, the LEDs were actually quite nice.
One thing I've learned to do is to use the 74AC series of chips, which can drive 24mA from each output; the 74HC chips I used earlier drove LEDs too dimly.
I didn't really understand the need for this stage at first, but this stage is essentially a series of small transistors that the microcontroller can use to control the large power MOSFETs. Some of the resources I read said that you needed these due to high gate threshold voltages, but I found some logic-level MOSFETs that could handle the current I wanted, so I thought I could get away without it. Well, on the low-side at least; on the high-side, the ATmega328 is unable to directly control a 12V signal due to its ESD-protection diodes. I had tried using the ATmega outputs to pull the high-side MOSFETs' gates down, which worked, but when I tried to float the ATmega's pins by setting them to an input state, the ESD diodes limited the gate voltage to somewhere around 5.5V. I used a simple 2-cent NPN transistor to do this, but I was still wondering why you would want a $1+ driver.
There were a couple reasons I found:
- The driver chips can add features, though I found that the cheapest chips don't add much that I could use for a 3-phase BLDC application.
- N-MOSFETs seem to overall be better than P-MOSFETs, in terms of both price and performance, so there's incentive to use N-MOSFETs on both the low and the high side. This means, though, that you have to drive the high-side gate pins above the highest supply voltage in the system, which requires some dedicated circuitry.
- The mosfet driver can push more current than the microcontroller, which will switch the mosfets faster, reducing switching losses.
The last point is what I think to be the most important and the one I understood the least. When I thought of switching losses, I assumed they meant the energy it takes to charge and discharge the MOSFET gate capacitance. Instead, it's actually the fact that while you are switching the MOSFET, the gate will be at a low voltage that is intermediate between "on" and "off", where the MOSFET allows a large current but takes a large voltage to do so, resulting in large power dissipation. In effect, either the voltage or current (depending on if you're switching it on or off) will turn on before the other one turns off, so for an amount of time dependent on the gate current, the MOSFET is draining power.
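A rough way to put numbers on this is the usual first-order estimate: the switching time is about the gate charge divided by the gate drive current, and during that time the MOSFET dissipates on average about half of (voltage x current). The sketch below uses that approximation; the 12V and 20A figures come from this project, while the gate charge, gate currents, and PWM frequency in the usage note are made-up illustrative values.

```python
def switching_loss_watts(v_supply, i_load, gate_charge, gate_current, f_pwm):
    """First-order estimate of MOSFET switching loss.

    Assumes a linear voltage/current overlap during each transition, so the
    energy per transition is 0.5 * V * I * t_switch, with two transitions
    (on and off) per PWM period.
    """
    t_switch = gate_charge / gate_current  # seconds spent in the lossy region
    energy_per_transition = 0.5 * v_supply * i_load * t_switch
    return 2 * energy_per_transition * f_pwm


# Hypothetical values: 50 nC gate charge, 20 kHz PWM.
weak_drive = switching_loss_watts(12, 20, 50e-9, 0.02, 20e3)   # ~20 mA gate drive
strong_drive = switching_loss_watts(12, 20, 50e-9, 1.0, 20e3)  # ~1 A gate driver
```

With these made-up numbers the weak drive comes out around 12W and the strong drive around 0.24W, which is the kind of difference that would make a dedicated driver worth its price. Real parts will differ, so treat this as an order-of-magnitude illustration only.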
In the breadboard picture you can see me using some "High-Low side independent" drivers, which seemed to work pretty well, though at a slightly higher price than I would have wanted.
This is a relatively simple part of the circuit to build, but I spent an inordinate amount of time picking them out, just because there are so many darned options. I wanted to not have to heatsink the mosfets, so I was looking for some with low Rdson. When I picked out the parts I was also looking for ones that could be driven by 3.3V inputs; for the part I picked, though, that came with a lower maximum gate voltage rating, which I was exceeding, so I wouldn't recommend it.
The main issue I ran into was that the ATmega kept on behaving very erratically and it took me a while to pin down all the reasons why. The main things I had to do were
- Connect ground better throughout the system. I'm not really sure how this was possible, but I was seeing 100mV swings between the two ground pins on the ATmega. I had wired up the ground on each side to the ground rails on their respective sides, but I had only connected the ground rails at the bottom of the breadboard setup. I connected the ground rails directly over the ATmega, which fixed this issue.
- I connected more filter caps everywhere.
- I had to be very careful to not connect high-voltage to the ATmega pins, especially when set to output mode. I blew out two ATmegas entirely, which was actually easy to debug despite being disappointing, but the high voltage at times had more insidious effects, causing the ATmega to skip around the program wildly and reset the timers randomly.
Once I had the breadboard all set up and the electrical parts working, the main challenge was the firmware. Getting the motor to jerk around randomly was quite easy, but getting it to move smoothly, especially at high speed, was quite challenging and I'm still not done.
There are two ways to control brushless motors: you can use Hall-effect sensors to directly measure the movement of the motor, or you can measure the back-emf generated by the unused motor phase, called sensorless control. It sounds pretty daunting, but the electrical part of it is fairly straightforward: you measure the voltage on the unused phase (ie the one you're not powering or pulling to ground), and use that as the input to your firmware.
I had a lot of issues with this because the ATmega's ADC is either slow or inaccurate, at least in the context of this application. An ADC conversion takes 13 ADC clock cycles, and by default the Arduino libraries set the ADC prescaler to 128, giving an ADC conversion time of about 100us. By comparison, my motor should be capable of driving at 12k rpm with no load. From some simple empirical testing, it looks like my motor requires 42 commutation steps to do a complete revolution, which means that at a full speed of 200 rotations/s, a commutation needs to happen every 120us. Clearly, a 100us ADC time makes this impossible.
I changed the ADC prescaler to 4 (setting it to 2 made the ADC not work), meaning that the ADC conversion should theoretically take about 4us. I think there was a fair bit of overhead, since I was only doing about 8 ADC measurements per commutation cycle, at significantly less than the maximum of 12k rpm. I wanted to increase the number of back-emf measurements I could do; some references suggest that you only need to do three or so, but I wanted to shoot for closer to 10 so that I could get better resolution and potentially do some debugging.
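The timing numbers above can be reproduced with a few lines of Python. This is just the arithmetic, assuming the standard 16 MHz Arduino clock (the post doesn't state the clock speed); the function names are mine.

```python
F_CPU_HZ = 16_000_000           # assumption: standard 16 MHz Arduino clock
ADC_CYCLES_PER_CONVERSION = 13  # from the text above


def adc_conversion_us(prescaler):
    """Time for one ADC conversion, in microseconds, ignoring software overhead."""
    adc_clock_hz = F_CPU_HZ / prescaler
    return ADC_CYCLES_PER_CONVERSION / adc_clock_hz * 1e6


def commutation_period_us(rpm, steps_per_rev):
    """Time available per commutation step, in microseconds."""
    steps_per_second = rpm / 60 * steps_per_rev
    return 1e6 / steps_per_second


default_adc = adc_conversion_us(128)         # 104.0 us
fast_adc = adc_conversion_us(4)              # 3.25 us
budget = commutation_period_us(12_000, 42)   # about 119 us
```

At the default prescaler a single conversion eats nearly the whole 119us budget, while at a prescaler of 4 the raw conversion time would allow dozens of samples per step, so the 8 or so I actually got suggests most of the time was going to overhead rather than the ADC itself.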
So, the method I ended up going with was using some LM358 op amps as voltage comparators, since the control algorithm only requires a threshold detector, rather than fine-grained voltage measurement over the entire range of voltages. I was pretty surprised though: this didn't increase the controller speed at all! I think the issue was from using the Arduino's digitalRead() function, which turns out to be quite expensive (maybe 10s of microseconds). I switched to directly reading from the corresponding PIN registers, and suddenly the system could drive the unloaded motor at 11k rpm -- not quite at the max, but at this point I was maxing out the current my little wall adapter could provide. It's quite an experience to have something on my desk rotating 180 times a second! With the propeller attached, it was able to drive the motor at about 60 rotations/s, though that was limited again by the dc adapter current. Side note -- I'm planning on adapting a spare computer ATX supply I have as a simple bench supply, since it has both 12V and 5V rails and a large current capacity.
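The reason a comparator is enough is that the firmware only needs to know when the floating phase crosses a threshold, not what its exact voltage is. Here's a tiny Python sketch of that idea; it is not the firmware, the function names are hypothetical, and the 6V threshold in the usage (half of the 12V supply) is an assumption about where the crossing of interest sits.

```python
def comparator(v_phase, v_threshold):
    """Model of the comparator output: True when the phase is above the threshold."""
    return v_phase > v_threshold


def first_crossing(samples, v_threshold):
    """Return the index of the first sample where the comparator output flips.

    Returns None if the output never changes from its initial state.
    """
    initial = comparator(samples[0], v_threshold)
    for index, voltage in enumerate(samples[1:], start=1):
        if comparator(voltage, v_threshold) != initial:
            return index
    return None


# Hypothetical rising back-emf samples on the floating phase, in volts.
crossing = first_crossing([1.0, 2.0, 5.0, 7.0, 9.0], 6.0)
```

On the real hardware the comparator output is a single digital pin, so each "sample" is one register read instead of an ADC conversion.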
You can find the firmware here, though I would discourage using it or copying anything from it.
The next step is to get PCBs made, which can be lighter and higher efficiency (and cooler, both thermally and socially). I made a few changes from the breadboarded version:
- I used 3-input AND gates, since you can get 4 of them in the same number of chips that I was using to get 4 2-input gates. I used the extra pin as an "ENABLE" signal, which is currently just pulled high with a 10k resistor, but it gives me the option of using it as a "hard stop" signal that bypasses the microcontroller.
- I changed from using 3 "high-low" drivers to a single 3-phase driver. This should be both cheaper and more functional than my previous strategy.
- I changed the MOSFETs I'm using. I went with these MOSFETs from NXP -- they seemed to offer the best combination of low-cost, low-size, high-efficiency, and large number of pin-compatible options for future replacement. But again, there are so many options that it's hard to know how good a choice this is. I picked their 2mΩ variant, which should hopefully be efficient enough to drive up to 20A without requiring crazy heatsinking.
- Added a 5W current-sensing resistor. The 5W rating seems excessive, especially since it's a 3mΩ part, but 20A is a lot and forced me to revise my perceptions of what a "power electronics system" looks like.
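For a sense of scale on those last two items, the steady-state resistive dissipation is just I²R. The sketch below runs that for the 3mΩ sense resistor and one 2mΩ MOSFET at 20A; it ignores PWM duty cycle, switching losses, and the increase in Rdson with temperature, so it is only a rough check.

```python
def resistive_loss_watts(current_a, resistance_ohm):
    """Steady-state power dissipated in a resistance: P = I^2 * R."""
    return current_a ** 2 * resistance_ohm


sense_resistor_w = resistive_loss_watts(20, 0.003)  # about 1.2 W
mosfet_w = resistive_loss_watts(20, 0.002)          # about 0.8 W per conducting MOSFET
```

So the 5W rating on the sense resistor leaves roughly a 4x margin at 20A, and each conducting MOSFET has to shed a bit under a watt.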
Here's the board I came up with, which measures 1.65" x 2.15" ($17.50 for three at OSH Park). All the components, and the majority of the routing, are on the top layer: I wanted to keep the ground plane on the backside as pristine as possible, because I'm nervous about interference with the microcontroller. I'm hoping that the current-sense resistor, which I put on the low side, hides some of that:
Eagle files here (same disclaimer about how you shouldn't trust them).
You can see that I put a couple dividing silkscreen lines to visually separate the different stages: the top is the power stage (high-voltage high-current), the middle high-voltage, and the bottom low-voltage.
- Power stage: schematically very simple, but tricky to route due to the large current requirements. The large number of vias in this stage are mostly for thermal purposes, though I suppose they should also improve conductivity.
- High-voltage stage: this consists of the motor driver, analog multiplexer, and analog comparator (plus associated passives), which are all equipped to interface with the 12-18V supply.
- Digital / low-voltage stage: this consists mostly of the microcontroller, its parts, and the two AND chips, all of which need to be protected from the high voltage. This stage is far more pin-dense than the other two, leading to routing congestion, which I should have foreseen before placing the ICs and starting routing, at which point I didn't want to rip up my work.
I've never tried assembling something with this many components on it, especially at this density, so I'm hoping that it's going to be assemble-able. I'm a little unhappy with the number of components I needed, particularly the 5 non-microcontroller ICs on the board. I think with more careful use of the ATmega pins I could probably do away with the AND gates and the mux, but for this version I purposefully left some pins unused so that I could repurpose them on the fly if needed. I don't think it'll come to this, but another option is to use a larger ATmega part with more IOs, though that costs a few more dollars.
On the topic of price, I don't feel like I've hit the price target I wanted with this board -- an off the shelf 20A ESC costs about $11, so I was hoping to hit around $15 per board for mine; instead, my total cost came out to about $23 for the first board. In decreasing order of amount, the main contributors are:
- $5.85 for the circuit board ($17.50 for 3). My current design doesn't quite fit into the 5cm x 5cm seeedstudio $10 size, but assuming I can get it to fit that, this will go down to $1.
- $5.45 for 6x 2mΩ MOSFETs. Once I build the other boards I'll get better bulk pricing ($4.90), and if I can use lower-efficiency 3.4mΩ MOSFETs (or lower), it will go down to $3.75.
- $2.75 for the ATmega328. I have a feeling that I can use a lower-flash ATmega part, such as the ATmega88 for $2.13, or maybe I'll bulk-order 25 ATmega328's for all my projects, and bring the cost down to $1.75 per.
- $2.25 for 6x 3A flyback diodes. I'm not sure what can be done about this, but maybe the RMS current will be low enough that I can use smaller diodes?
- $1.85 for the mosfet driver. I think I could build a cheap replacement for this, albeit without its non-essential functionality, though the main tricky part I foresee is building the high-side driver circuitry.
- $1.08 for the 2x AND gate chips; hopefully I can get rid of these entirely.
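Adding up the list gives a feel for how far the savings could go. This is only the arithmetic on the figures quoted above; the "projected" column picks the most optimistic option mentioned for each item (fitting the $10 Seeedstudio size, the 3.4mΩ MOSFETs, bulk ATmega328s, dropping the AND gates), and reading the leftover amount as unlisted parts like passives is my assumption.

```python
# Per-board costs quoted in the list above (USD).
current = {
    "pcb": 5.85,
    "mosfets": 5.45,
    "atmega": 2.75,
    "diodes": 2.25,
    "driver": 1.85,
    "and_gates": 1.08,
}

# Most optimistic figure mentioned for each item; unchanged where none is given.
projected = {
    "pcb": 1.00,
    "mosfets": 3.75,
    "atmega": 1.75,
    "diodes": 2.25,
    "driver": 1.85,
    "and_gates": 0.00,
}

current_total = round(sum(current.values()), 2)      # 19.23
projected_total = round(sum(projected.values()), 2)  # 10.6
unlisted = round(23.00 - current_total, 2)           # 3.77, parts not itemized above
```

If the unlisted parts stay at about $3.77, the optimistic version lands a little over $14 per board, which would just about hit the $15 target.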
Anyway, these are all ideas for the next round, and for now I'll just wait for the boards to arrive and work on other projects in the meantime: it's time to learn about switched power supplies, so I can power all this 5V circuitry from the 12V RC battery pack without needing a massive heatsink for my linear regulator.
I've blogged in the past about my Nexys 3, though I haven't used it very much lately (other than leaving it in bitcoin-mining-mode, where it's earned me about ten cents in the past week).
I was browsing the Digilent website for an ARM-based Raspberry-Pi equivalent (I already forget why), and I checked out their new products page and saw that they've just released the Nexys 4. I think I'll get one of these eventually, since looking at the product page there are a lot of improvements to areas I was hoping to improve:
- The biggest change is an upgrade to Xilinx's new 7-series fpga, the Artix-7. There are some weird economics around the Artix-7, which I've been meaning to blog about, but the key point is that the XC7A100T-CS324 part they include -- I assume the full part number is XC7A100T-1CSG324C -- starts at around $130 on Digikey, which makes the Nexys 4 look like a pretty good deal (for comparison, the Spartan 6 LX16 on the Nexys 3 starts around $28). This Artix part is quite big, weighing in at 100k cells -- Xilinx originally planned on offering smaller sizes, but currently there are only the 100k and 200k variants. 100k cells is about 7 times the capacity of the part on the Nexys 3 board; the 7-series includes a process and architectural upgrade as well, which presumably gives power and speed improvements in addition to the capacity increase.
- Less groundbreaking, but still nice, is that the peripherals are improved. I'm probably the most excited about the cheapest ones they added: they increased the number of slide switches and leds from 8 to 16, and put two 4-digit seven segment displays on the board. There are a bunch of other cool things like an audio jack, accelerometer, and temperature sensor as well; you can see the full list on their product page.
One thing to keep in mind is that the Xilinx software is quite expensive, and at least for my purposes I'd like to stay with chips their WebPack license supports; it took me a while to find, but here's the doc explaining compatibility. For the Spartan 6 line, the WebPack license only goes up to the LX75, keeping the largest few chips reserved for paid usage. For Artix, presumably because it's their low-cost chip and they only offer two variants, both the 100T and the 200T versions are supported in WebPack, offering quite a bit larger fpga capacities in Xilinx's free software tier.
So overall I'm very excited about the upgrades they made and it definitely looks like the Nexys 4 is much better than its predecessor, though personally I feel like I'm more at the point that I'd rather learn how to design my own FPGA board than pay another $300 for another dev board.
In my Component Area Costs post, I did some back-of-the-envelope calculations to show that, perhaps unsurprisingly, PCB fabrication costs can easily dominate component costs for single- or small-quantity PCB orders. My reaction was to start minimizing the areas of all my PCBs; I think this is generally a good thing, but I wanted to test the limits of this. I'm not sure what limits how small a board can be, so I've produced a series of boards that are the same circuit, but in steadily-decreasing board sizes. Here's the board layout for my latest one, which I just received today:
And here's a picture comparing the board to my previous two iterations; they cost $21.50 for 3 (OSH Park), $9.90 for 10 (Seeedstudio), and $3.55 for 3 (OSH Park) respectively.
Assembling the board
This board comes in at 1.1" x 0.65", about twice the size of my thumbnail. When I was designing the board, my main concern was just how packed the layout looks. Once I received the board, I was actually pretty surprised how much space it felt like there was; the main problems I had were 1) handling a board this small, and 2) soldering the new TSSOP components. I used some masking tape to hold the board in place: I found that what worked best was to put a small loop of tape on the table, and press the board onto it in whichever orientation I wanted. I started with the bottom, since it has the harder components (a 14-TSSOP); I tried to solder each pin individually, but pretty quickly failed and resorted to the "put too much solder and then use desoldering braid" technique. I remember seeing somewhere that this method is "not recommended due to thermal considerations", but it worked great for me and everything seems fine. Here's what I was able to do; I need to get a stronger magnifying glass to take better pictures at this scale:
There's a ceramic capacitor in addition to the five tantalum ones -- I wanted to experiment with putting an 0805 component on a 1206 footprint (works fine but not optimal), and to see if I can use 10uF ceramic caps that are cheaper, smaller, and don't have polarity. On the bottom-left of this photo, you can see that I also tested OSH Park's silkscreen resolution; I printed text at 30, 24, 18, 12, and 6 mils. The 12- and 6-mil prints are unreadable (should have seen that coming), whereas the 30- and 24-mil ones are definitely readable. The 18-mil text is barely readable, and the '8' looks too much like the '0', so I'm going to stick with at least 24-mil silkscreen.
I can't really see how good the joints on the TSSOP parts are, but doing some simple multimeter-based testing seems to show that things are connected fine. So I went on to the top side of the board:
It doesn't look very pretty, but the connections seem to be fine and it held up to the multimeter testing.
I plugged it in, and quickly realized that there was a problem: I had messed up the layout. The problem was very small but still unrecoverable: I had swapped two of the LEDs, so that the top and bottom "D5" LEDs were reversed. I had even helpfully named all the parts in my Eagle schematic to avoid issues like this, but I had turned off the "tNames" layer since it was cluttering things up, which I guess left the door open for this mistake. Anyway, this board was more for testing my assembly ability, so I wasn't too concerned; plus it gave me a good excuse to create a new rev.
Pushing it farther
At this point I was pretty happy with the miniaturization, though I still had a nagging feeling that there was room for more. I don't know how to tell where the limit is without having something fail, so I decided to do one final layout that is as small as I can design it (in a reasonable amount of time). As opposed to the above boards, which took maybe 30 minutes to lay out then autoroute, this next version took several hours of manual placement and routing. Going down to SMD and taking out "extras" like board edge padding and duplicate headers got me a lot of space reduction with relatively little effort, but at this point the wins would have to come just from optimizing the layout and routing.
The resulting board feels much less pleasing aesthetically, since I had to break symmetry to get past a certain optimization point. For this kind of project I don't care too much about that, but I'm worried that the irregular nature will make it more tedious to solder as well. Here's what I came up with:
Schematically, there are only minor, non-functional changes; I rearranged some pins and the order of some series components to improve the routing situation. I put the components closer to the edge of the board and decreased the via pad size from a buffer of 35 mils to the OSH Park minimum of 27 mils, and I placed components using a fine 1-mil grid. I think the main improvements, though, come from manually placing and routing all the components, which greatly reduced the overall routing cost -- it looks fairly clear that the total trace length is much lower in this version, and there are also far fewer vias. I'm happy again with the results: this new board is 0.85" x 0.48", or about 56% of the area of the previous version (which makes it about the size of my thumbnail, I guess), for a total OSH Park cost of $2.00.
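The area and price comparison is easy to check with a couple of lines. The sketch below assumes OSH Park's pricing of $5 per square inch for a set of three 2-layer boards (an assumption, not something stated in this post), and uses the rounded dimensions quoted above, so the results land close to the quoted figures rather than exactly on them.

```python
OSH_PARK_USD_PER_SQ_IN = 5.00  # assumption: price per square inch for 3 copies


def board_area_sq_in(width_in, height_in):
    """Board area in square inches."""
    return width_in * height_in


def osh_park_price_usd(width_in, height_in):
    """Estimated price for a set of three boards under the assumed pricing."""
    return board_area_sq_in(width_in, height_in) * OSH_PARK_USD_PER_SQ_IN


previous_area = board_area_sq_in(1.1, 0.65)   # about 0.715 sq in
new_area = board_area_sq_in(0.85, 0.48)       # about 0.408 sq in
area_ratio = new_area / previous_area         # about 0.57
new_price = osh_park_price_usd(0.85, 0.48)    # about $2.04
```

The estimates come out at roughly $3.58 for the previous board and $2.04 for the new one, against the $3.55 and $2.00 actually charged; the small gap is consistent with the dimensions above being rounded.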
There are a couple more things I've thought about doing if I wanted to decrease the size even more -- first of all, I could start using 0805 capacitors, or even redesign the circuit to use smaller capacitors that are (affordably) available in 0402 packages. I could rearrange the schematic so that all the resistors are connected to Vcc, which means I could replace the 12 discrete resistors with one or two resistor arrays. I think some of these footprints are somewhat generous -- I'm not sure if I need all that space around the already-large 1206 capacitors, for instance.
I'm going to save all these new ideas for after I test out this version, since I'm already not sure if it's manufacturable.
Component area costs
Using the costs I came up with in my previous post, this board has a minimum cost of:
- 8-pin header: $0.40
- 14-TSSOP: $0.14
- 6x 1206: $0.21
- 26x 0603: $0.31
- 2x 6-TSSOP: $0.05 (just guessing)
Which comes out to $1.11, compared to the $2.00 the board actually came out to. Sometimes I try to estimate the cost of a potential circuit board I want to make, and it seems like taking the minimum cost from my previous post and multiplying by 2-3x (a number that probably varies heavily with types of components and circuits) could be a reasonable first-order approximation. Or, it might make more sense to add one or two cents of "routing cost" per pin.
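For reference, here's the sum and the resulting multiplier as a quick Python check. The per-item numbers are the ones listed above; the variable names are mine.

```python
# Minimum area costs per footprint group, as listed above (USD).
area_costs = {
    "8-pin header": 0.40,
    "14-TSSOP": 0.14,
    "6x 1206": 0.21,
    "26x 0603": 0.31,
    "2x 6-TSSOP": 0.05,  # the post's own guess
}

minimum_cost = round(sum(area_costs.values()), 2)  # 1.11
actual_cost = 2.00
multiplier = round(actual_cost / minimum_cost, 2)  # 1.8
```

So for this heavily hand-optimized board the actual cost is about 1.8x the footprint-only minimum, which sits just under the 2-3x range suggested above for a typical layout.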
Anyway, I'm not really sure how it makes sense for OSH Park to take $2.00 orders (they must do it for non-financial reasons), but as long as they want them I'll send them in :)