Summer has been busy and it’s now behind us. I won’t annoy you with all the details of what happened, but I wanted to come back to a project I started a few months ago with my kids. I called this project CLEGO.

CLEGO stands for Cluster for Learning and Exploration of Global Observations, or simply, Cluster made out of LEGOs.

Ubuntu 16 running on one of CLEGO’s nodes.

The idea was to build a silent, compact, and very powerful cluster to study Apache Spark in a dedicated, non-cloud, environment.

I initially thought I would 3D-print the parts, but quickly realized I would be crazy (or crazier) to design all the parts and then print them. It made sense to go with LEGO, especially as this was not my first attempt (my coup d’essai): I built an entire data center in LEGO back in 2006. If you do not want to watch the whole video, the data center part starts around 1:41. And, yeah, be tolerant: this video is more than 10 years old and I did not become a famous YouTuber thanks to it.

Picking the Right Hardware

It starts with the CPU. No way I could go with AMD. I also did not want Xeons because of their price tag and their consequences on the rest of the hardware (ECC memory, larger mainboard…), so I settled on used i5-6400T processors. They have 4 cores and 4 threads and run at 2.20GHz, but most importantly their TDP (Thermal Design Power) is only 35W. The non-T version (i5-6400) dissipates 65W and only runs a little faster. A 35W envelope is closer to what you would expect from a laptop CPU than from a desktop one.

Now that I had the CPU, I needed the motherboard. I looked quite a bit for a board that would have 2 network connectors or 10 Gbit/s Ethernet, but I settled for the Asus B150M-A/M.2. The form factor (Micro ATX) was a good reason: I still wanted something compact. But it was the 4 DDR4 DIMM slots and USB 3 that made the cut. If I could not get faster or more Ethernet on the board, I could still plug in a USB 3 adapter.

Memory was a no-brainer. I wanted DDR4 and to max out each node at 64GB, so I picked the Corsair Vengeance LPX 32GB kit (2x16GB) of DDR4 DRAM at 2133MHz (PC4 17000), two kits per node. As much of a no-brainer as it was, it was also the most expensive budget line in this project. But hey! My cluster has 256GB of RAM; that’s the size of the SSD in my work laptop.
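
For the curious, here is a quick sanity check of those totals. It is only a sketch: the figures come from this article, the variable names are mine, and it assumes one CPU per node and all four DIMM slots populated.

```python
# Cluster totals, using the figures quoted in the article.
NODES = 4                  # CLEGO has 4 nodes
DIMM_SLOTS_PER_NODE = 4    # the Asus board has 4 DDR4 DIMM slots
MODULE_GB = 16             # each Corsair kit is 2 x 16GB
MODULES_PER_KIT = 2
CORES_PER_CPU = 4          # i5-6400T: 4 cores, 4 threads (one CPU per node assumed)

ram_per_node_gb = DIMM_SLOTS_PER_NODE * MODULE_GB
kits_per_node = DIMM_SLOTS_PER_NODE // MODULES_PER_KIT
total_ram_gb = NODES * ram_per_node_gb
total_kits = NODES * kits_per_node
total_cores = NODES * CORES_PER_CPU

print(f"RAM per node: {ram_per_node_gb}GB, total: {total_ram_gb}GB")
print(f"Memory kits to buy: {total_kits}")
print(f"Total cores: {total_cores}")
```

Eight kits of memory explain why this line dominated the budget.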

CLEGO uses NVMe drives.
NVMe drives look pretty much like a memory module.

As the level of noise was a real priority, I did not want mechanical drives for storage. I was at first thinking about classic 2.5″ drives, but settled for M.2 “drives”. After a little bit of research, I decided on Intel’s SSD 6 Series, which are basically NVMe drives. Those drives have pretty awesome specs: sequential read up to 1775MB/s, sequential write up to 560MB/s, and 128.5K IOPS. I picked the 512GB model; if I need more, I can still access my 12TB SAN.

Despite having low-wattage processors, I still wanted active airflow over each CPU. This is handled by a Cooler Master GeminII M4 CPU cooler. It has 4 direct-contact heat pipes to drain those calories out. However, the most interesting aspects for me were the variable-speed fan and the very, very low sound level. I must admit I really like the great Cooler Master packaging and the engineering of the metal parts and thermal paste. It’s definitely a great product.

I figured I could get by with only two power supplies, as long as they had a rather high wattage. I went with a couple of SeaSonic SS-520FL2 units. They are fanless, rated at 520W, and have modular cabling. In my scenario, each PSU needs to feed 2 motherboards and 2 CPUs, so I did not want unused cables taking up valuable room. The unit comes with enough cables for one motherboard and two CPUs, so I had to get an Eyeboot ATX 24-pin female to 24-pin male Y-splitter power cable.
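
To get a feel for the headroom, here is a rough power budget for one PSU. This is only a sketch: the 520W rating, the 35W TDP, and the two nodes per PSU come from this article, but `OTHER_WATTS_PER_NODE` is a pure guess on my part (motherboard, RAM, NVMe drive, fan), not a measurement. Keep in mind that TDP is a thermal design figure, not a peak power draw.

```python
# Rough power budget for one of the two power supplies.
PSU_WATTS = 520             # SeaSonic SS-520FL2 rating
NODES_PER_PSU = 2           # each PSU feeds 2 motherboards and 2 CPUs
CPU_TDP_WATTS = 35          # i5-6400T TDP
OTHER_WATTS_PER_NODE = 40   # ASSUMPTION: board, RAM, NVMe, fan; not measured


def psu_load(nodes: int, cpu_watts: float, other_watts: float) -> float:
    """Estimated load on one PSU for the given number of nodes."""
    return nodes * (cpu_watts + other_watts)


load = psu_load(NODES_PER_PSU, CPU_TDP_WATTS, OTHER_WATTS_PER_NODE)
headroom = PSU_WATTS - load

print(f"Estimated load per PSU: {load}W")
print(f"Headroom: {headroom}W ({headroom / PSU_WATTS:.0%} of the rating)")
```

Even if the guess for the other components is off by a factor of two, each PSU would still be far from its rating.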

Networking is handled by a TOR (Top of Rack) switch: a basic, but efficient, Netgear GS108Ev3 8-port gigabit managed switch. Originally, I wanted 4 ports for the 4 nodes, 1 port for the uplink, and 1 port for a Raspberry Pi to monitor the environment (temperature, etc.). But I left this last idea out for now, meaning that a 5-port switch could have been enough.
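
Since the board could not offer more than gigabit Ethernet, it is worth comparing the link speed with the sequential read speed of the NVMe drives. The sketch below only uses figures quoted in this article (gigabit ports, 1775MB/s sequential read); it ignores protocol overhead, so the link figure is an upper bound, and the variable names are mine.

```python
# Compare a gigabit link with the NVMe sequential read speed.
GIGABIT_LINK_MBIT_S = 1000   # one port of the gigabit switch
NVME_SEQ_READ_MB_S = 1775    # "sequential read up to 1775MB/s"

# 8 bits per byte; line rate only, no Ethernet/TCP overhead.
link_mb_s = GIGABIT_LINK_MBIT_S / 8
ratio = NVME_SEQ_READ_MB_S / link_mb_s

print(f"Gigabit link: at most {link_mb_s:.0f}MB/s")
print(f"Local NVMe read is up to {ratio:.1f}x faster than the link")
```

In other words, reading data locally should be much cheaper than pulling it from another node, which is something to keep in mind once Spark starts moving data around the cluster.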

Speaking of ancillary equipment, I also included a 4-port KVM from Iogear, the GCS24U. It’s a little bulky (especially with all the cables), but I like the idea of a “remote” switch (on a cable), so the button does not have to sit next to the KVM and its cables.

The thermostat of the AC Infinity Airplate T7, displaying the temperature of the system and the desired temperature.

Despite the low heat output of the CPUs and their fans, I wanted to make sure that the hot air would not stay on the motherboards. The power supplies are also fanless, but we still need to dissipate those calories. I added the AC Infinity Airplate T7 to the system. It provides really quiet cooling with a thermostat control, which allows for precise temperature control. This is usually used for home theater AV cabinets.

Last but not least, I needed a few cables: LED, power and reset switches, as well as speakers.

We are now ready for assembly.

Building the Chassis

I wanted to keep the footprint of the system to the equivalent of two green baseplates, so 32 by 64 studs. You can see the connectors in the front and the first power supply in the back.
A top view. At this stage of the project, I was still measuring and aligning elements. The CPU is not in place, neither is the memory. The Cooler Master fan is not attached.
A larger view, including my dining table, which was a great asset for this project!
A more distinct view of the power supply and the power area, which I wanted white…
Another view of the power supply. I originally wanted only 3 windows. The final project has 4 windows on each side.
Starting the 2nd node.
The first node is ready.
Working on the switch.
First attempt for the switch.
It works!
Make sure the cables are loose but not the switch.
The case, focusing on the reset button. Sorry for the bad brick here, it probably got lost.
Zoom on the switch itself.
CLEGO node one off…
CLEGO node one on…
Door to the connectors.
Second node working.
View through the windows…
The first two nodes running independently on the same power supply.
Fan is working too…
A background view on the Asus BIOS.
Even if I minimized cables (e.g. by using NVMe M.2 instead of SATA SSD), it’s still a mess. I took care of it.
Front cabling starts to be messy with 2, how is it going to be with 4?
The switches and LEDs are mounted in a small container that slides into the chassis: if a switch or LED fails, it can easily be changed.
Taylorism in action.
More Taylorism in action.
Each node (except the first) is in a cage, the same idea as the LEGO Creator Expert houses (each floor is separate).
Zoom on the power supply.
Cables around a brick.
The second power supply in place, as well as the external fans.
The external fans, slightly inset in the chassis. Note the full space to easily add the cages for nodes 3 and 4.
Inside the cooling tower.
The space between node 2 and node 3. Or node 1 and 2… Or…
Cages 3 and 4 in preparation. Ignore the Kragle. No Kragle was used in this project!
Same cages, different angle.
CLEGO with 2 nodes.
Getting an idea of the final volume.
The tower of death… At this point, I was really thinking of adding Darth Vader and Luke Skywalker…
Preparing the Top of Rack Switch. You can also see the lock to close the front panel.
Integrating the thermostat.
The external fans with the 4 nodes.
First attempt at the cover, this is not what I finally picked.
The 4 nodes waiting for the cover to come in place.
Cables are messy!
The new platform is ready to get the common elements.
Integrating the switch and KVM.
A first attempt at placing the external network cable and the KVM remote.
The final top of the rack placement with inbound and outbound data cables, KVM cables, as well as a transparent window to access the KVM display and the temperature of the thermostat.
Focus on the thermostat’s information, cover closed.
Focus on the thermostat’s information, cover open.
Final assembly, cover closed.
Final assembly, cover open.
CLEGO can be transported, but it is not an exercise I love doing!
Installing Ubuntu…
The final logo.

Final Dimensions & Thoughts

CLEGO measures about 25.4cm (10″) x 50.8cm (20″) x 42.2cm (16.6″). It now rests on a monitor stand, so I can have it under my desk, instead of “on” my desk. It is “pretty” heavy. I will have to weigh it sometime, but moving CLEGO is always a challenge.
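
The footprint can be cross-checked against the 32 by 64 studs of the two baseplates. The sketch below assumes the standard LEGO stud pitch of 8mm (that figure is not from my measurements, and the function names are mine); the result lands within a few millimeters of the rounded dimensions quoted above.

```python
# Convert the 32 x 64 stud footprint into centimeters and inches.
STUD_PITCH_MM = 8.0   # ASSUMPTION: standard LEGO stud-to-stud spacing
CM_PER_INCH = 2.54


def studs_to_cm(studs: int) -> float:
    """Length in centimeters of a run of `studs` studs."""
    return studs * STUD_PITCH_MM / 10


def cm_to_inches(cm: float) -> float:
    """Convert centimeters to inches."""
    return cm / CM_PER_INCH


width_cm = studs_to_cm(32)
depth_cm = studs_to_cm(64)

print(f"Width: {width_cm:.1f}cm ({cm_to_inches(width_cm):.1f} in)")
print(f"Depth: {depth_cm:.1f}cm ({cm_to_inches(depth_cm):.1f} in)")
```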

It was definitely an interesting project, but now the more difficult tasks are coming: installing and optimizing Apache Spark to leverage this power. Look forward to more articles on this, hopefully soon!
