Why is MATLAB so fast in matrix multiplication?

I am running some benchmarks with CUDA, C++, C#, and Java, and using MATLAB for verification and matrix generation. When I perform matrix multiplication with MATLAB, 2048×2048 and even bigger matrices are multiplied almost instantly.

              1024×1024    2048×2048    4096×4096
              ---------    ---------    ---------
CUDA C (ms)       43.11       391.05      3407.99
C++ (ms)        6137.10     64369.29    551390.93
C# (ms)        10509.00    300684.00    …
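
The post does not show the code behind the C++ timings, so the following is only a minimal sketch of what such a measurement might look like, assuming a textbook triple-loop product over row-major square matrices. The names `Matrix`, `multiply_naive`, and `time_multiply_ms` are hypothetical and not taken from the original benchmark.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Row-major n x n matrix stored in one contiguous buffer (assumed layout).
using Matrix = std::vector<double>;

// Textbook O(n^3) product C = A * B. Hypothetical helper: the post does not
// show its actual C++ implementation.
Matrix multiply_naive(const Matrix& a, const Matrix& b, std::size_t n) {
    assert(a.size() == n * n && b.size() == n * n);
    Matrix c(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < n; ++k) {
                // b is walked down a column here, i.e. with stride n.
                sum += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = sum;
        }
    }
    return c;
}

// Times one n x n multiplication and returns the elapsed wall time in ms.
double time_multiply_ms(std::size_t n) {
    Matrix a(n * n, 1.0);
    Matrix b(n * n, 2.0);
    const auto start = std::chrono::steady_clock::now();
    const Matrix c = multiply_naive(a, b, n);
    const auto stop = std::chrono::steady_clock::now();
    // Read one element so the compiler cannot discard the computation.
    volatile double sink = c[0];
    (void)sink;
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling `time_multiply_ms(1024)` from a `main` function would give one number comparable in kind to the C++ row of the table. The absolute value will depend heavily on compiler flags and hardware.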