Hello! Today I will be discussing CPU considerations for machine learning.
This was actually a little bit difficult to put together since there are lots of use cases in the world of computing. A CPU that is great for gaming might not necessarily be optimal for machine learning. There are different factors that need to be considered. Even within machine learning, there are lots of different tasks that get performed that I will go into.
My target audience here is people looking to build their own workstations so that they can do things like Kaggle competitions without paying a fortune to Amazon or another cloud service. This is for people who want a personal workstation to do relatively smaller machine learning workloads without putting things on the cloud.
The reason I need to specify my audience so clearly is that the spectrum of machine learning is quite wide. A company's needs will be vastly different due to data volume and available budget. Typically, in those cases you will want to go to the cloud or have an on-premise data processing center. Considerations that Tesla has to think about might not apply at all at this smaller scale. Scale makes such a huge difference with everything related to machine learning. Ok, now that that is out of the way and I've made my disclaimer about who this is targeted for – let's get into it.
I'll cover the different components of CPUs and which of them are important for machine learning at the personal scale. Rather than tell you exactly what CPU to buy, I want to provide as much information as possible so that you can make your own decision based on the CPU qualities, your use case, and your budget. If you want specific advice for your exact use case, please reach out to me; I would be more than happy to offer my 2 cents!
OK, so let’s get into a few of the components of CPUs. The traits I will discuss are cores, threads, and PCIe lanes.
First, let's start with cores. Cores are pretty well understood: more cores allow more work to run in parallel. Closely tied to cores is threading. On CPUs that support simultaneous multithreading (Intel brands this Hyper-Threading; AMD calls it SMT), each physical core presents two hardware threads. Think of threads as a virtual concept, not a physical one: this technique splits a single physical CPU core into two logical cores. The number of hardware threads tells us how many software threads the CPU can execute simultaneously – how many tasks your computer can truly be running at any given time.
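To see the core/thread distinction on your own machine, here's a tiny sketch using only the Python standard library. Note that `os.cpu_count()` reports *logical* CPUs (hardware threads), so on an SMT-enabled chip it will typically be double the physical core count.

```python
import os

# os.cpu_count() reports LOGICAL CPUs, i.e. hardware threads.
# On a CPU with SMT/Hyper-Threading enabled this is typically
# twice the number of physical cores (e.g. 12 cores -> 24 threads).
logical = os.cpu_count()
print(f"Logical CPUs (hardware threads): {logical}")
```

Getting the physical core count separately requires a third-party package such as `psutil`; the standard library only exposes the logical count.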
So, cores and threads are pretty closely tied together. One component that gets talked about quite a bit with machine learning is PCIe lanes. PCIe lanes are the data paths connecting the CPU to devices like GPUs; during training, they carry data from system RAM to GPU memory.
There are two main compute-intensive tasks that the CPU will be involved in: preprocessing and model training.
Let’s talk about preprocessing first. No matter your use case, data preparation and reading in data initially are all done on the CPU.
If you're doing bulk preprocessing, the speed of your processing cores is going to be very important. The number of cores is also going to matter if you are able to spread that processing out across them. Python has some nice libraries that enable you to do just this.
The priority here will be clock speed followed by number of cores.
Sometimes, however, you don't do bulk processing; instead you do something called mini-batch processing, where you preprocess items asynchronously rather than all at once. This is the usual pattern when training deep learning models.
The priority here will be number of cores followed by clock speed.
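To make the mini-batch idea concrete, here's a simplified sketch using only the standard library: a background thread preprocesses batches and hands them off through a bounded queue, so preprocessing stays just ahead of training instead of happening all up front. (Real deep learning frameworks provide this pattern for you, typically with multiple worker processes; the transform below is hypothetical.)

```python
import queue
import threading

def producer(batches, q):
    """Preprocess mini-batches asynchronously and hand them to the trainer."""
    for batch in batches:
        preprocessed = [x * 2 for x in batch]  # hypothetical transform
        q.put(preprocessed)                    # blocks when the buffer is full
    q.put(None)  # sentinel: no more batches

batches = [[1, 2], [3, 4], [5, 6]]
q = queue.Queue(maxsize=2)  # small buffer: stay just ahead of training
threading.Thread(target=producer, args=(batches, q), daemon=True).start()

results = []
while (batch := q.get()) is not None:
    # In a real pipeline the GPU would train on `batch` here while the
    # producer thread preprocesses the next one in parallel.
    results.append(batch)
print(results)  # -> [[2, 4], [6, 8], [10, 12]]
```

More cores mean you can run more of these preprocessing workers concurrently without starving the training loop, which is why core count takes priority here.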
The other compute-intensive task is the actual model training itself. The question here is: are you planning to train deep neural networks or not? If not, you are likely going to be doing bulk preprocessing followed by training a model on your CPU.
This follows the same priority of clock speed, # cores.
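To make "training a model on your CPU" concrete, here's a minimal, pure-Python sketch of a CPU-bound training loop: batch gradient descent fitting a tiny linear regression. In practice you'd reach for a library like scikit-learn (many of its estimators can use multiple cores), but the arithmetic-heavy loop below is exactly the kind of work where a fast clock speed pays off:

```python
# Fit y = w*x + b by batch gradient descent on a tiny synthetic dataset.
data = [(x, 2.0 * x + 1.0) for x in range(10)]  # true w = 2, b = 1

w, b, lr = 0.0, 0.0, 0.01
for _ in range(2000):
    # Gradients of mean squared error with respect to w and b.
    grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
    grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 2), round(b, 2))  # should land close to 2.0 and 1.0
```

Each iteration depends on the previous one, so this loop can't be trivially split across cores – a good illustration of why clock speed comes first for classical CPU training.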
If you are planning to train deep neural networks, you will need lots of cores to keep feeding data to your GPU.
This follows the same priority of # cores, clock speed.
You will notice that PCIe lanes haven't been talked about much here. The reason for that is the scale of the machine learning we are dealing with. Tim Dettmers confirms this with the findings on his blog: there is essentially no performance difference based on the number of PCIe lanes, and he mentions you should just make sure that your motherboard can comfortably support the number of GPUs you plan on having in your final system. Where PCIe lanes do end up playing a role is in much larger-scale problems. At Tesla, Andrej Karpathy needs to be concerned about PCIe lanes due to the scale of the data he is working with. To give some context, he mentions that his networks learn from the most complicated and diverse scenarios in the world, iteratively sourced from Tesla's fleet of nearly 1M vehicles in real time, and that a full build of Autopilot neural networks involves 48 networks that take 70,000 GPU hours to train. This is definitely not a one-person project that a single workstation build can tackle. Therefore, PCIe lanes get left out of our equation.
So where does this leave us? Funny enough we can pretty much generalize everything covered here into one question: Do you plan on training deep learning models?
If yes, prioritize cores over clock speed. For this situation, consider an AMD processor since they are crushing it with the multi-core systems lately.
I recommend the AMD Ryzen 9 3900X 12-core, 24-thread. This is going to give you lots of cores at a great value. You can check it out on Amazon here.
If you're not going to be doing lots of deep learning, prioritize clock speed over the number of cores. For this situation, consider an Intel processor. There are fewer cores, which means less parallelization, but the ones they do have are typically blazing fast.
I recommend the Intel Core i9-9900K, which will give you 8 cores with turbo boost up to 5.0 GHz. You can check it out on Amazon here.
Links to those in the description as well. Both of these recommendations are not exactly cheap, but unfortunately this is not a cheap hobby that we have picked. If you want specific advice for your exact use case and budget, please reach out to me; I would be more than happy to help and offer my 2 cents!
I hope I was able to clearly explain the concepts behind why certain CPUs are beneficial in certain use cases so that you can make your own informed decisions when looking to buy a CPU that will tackle your machine learning workloads. Hoping to make this a series of covering the considerations of different components and what should be looked for if you’re purchasing them for machine learning workstations. Let me know what type of component you would like to see next!
Thanks so much, have a good day. BYE!
https://www.ai-buzz.com is a participant in the Amazon Services LLC Associates Program, an affiliate advertising program designed to provide a means for sites to earn advertising fees by advertising and linking to Amazon.com.