With the upcoming challenge and the integration of Fast-DC for PPS-DIV, I was curious about FFT sizes, so I ran a few tests and came up with these. The first is a plot of FFT size (in K) by k and n. That doesn't tell the whole story, though, since some k values have more candidates than others, so I also looked at the total number of tasks remaining for each k and generated the lower table, which shows the percentage of tests at each FFT size by n through the end of PPS-DIV. This is all FMA3 on a 3rd-gen Ryzen, so other CPUs may vary, but it should be in the right ballpark.
Up to n=8.7M can be done with one core on a 3rd-gen Ryzen CPU :)
____________ 92*10^1439761-1 NEAR-REPDIGIT PRIME :) :) :)
4 * 650^498101-1 CRUS PRIME
314187728^131072+1 GENERALIZED FERMAT
Proud member of team Aggie The Pew. Go Aggie!
We are now at n=6,485,697 for PPS-DIV. This is less than 100,000 away from where k=49 transitions to a 400K FFT length (n=6,585,098). At the present rate, we will certainly reach that before the challenge :)
Nice! This will have significant implications for anyone running CPUs with 1.5MB of cache per core. Going from 384K to 400K will likely mean that CPUs like the i5-8400, i5-9400, etc. will have the highest throughput with 2 x 3-thread tasks, rather than the 3 x 2-thread setup that has been best recently. The older-generation i5 CPUs (like my i5-3570) will probably now do best with 1 x 4-thread task rather than 2 x 2-thread tasks, but the i5-10400 will probably still be best with 3 x 2-thread tasks.
The 384K-to-400K transition is small, but significant for cache sizes, as 1.5MB/core exists on a lot of CPUs. Happy benchmarking everybody!
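For anyone who wants to check the cache arithmetic behind those predictions, here is a rough sketch. It assumes the working set is about 8 bytes per FFT element (one double), so a 384K FFT needs roughly 3072 KB and a 400K FFT roughly 3200 KB, and it assumes the best setup is simply the largest number of simultaneous tasks whose FFT still fits in each task's share of L3. Both are rules of thumb rather than anything LLR guarantees, and the function names are made up for illustration. The L3 sizes used are 9MB for the i5-8400, 6MB for the i5-3570 and 12MB for the i5-10400.

```python
# Rule-of-thumb sketch: does the FFT fit in each task's share of L3?
# Assumption: working set ~= FFT length * 8 bytes (one double per element).
BYTES_PER_ELEMENT = 8


def fft_footprint_kb(fft_len_k):
    """Approximate FFT data size in KB for an FFT length given in K elements."""
    return fft_len_k * BYTES_PER_ELEMENT


def best_split(cores, l3_mb, fft_len_k):
    """Return (tasks, threads_per_task) for the largest task count that
    divides the cores evenly and keeps each task's FFT inside its L3 share.
    Heuristic only: real throughput should be confirmed by benchmarking."""
    need_kb = fft_footprint_kb(fft_len_k)
    l3_kb = l3_mb * 1024
    for tasks in range(cores, 0, -1):
        if cores % tasks != 0:
            continue  # only consider even splits of the cores
        if l3_kb / tasks >= need_kb:
            return tasks, cores // tasks
    return 1, cores  # nothing fits: fall back to one task on all cores


if __name__ == "__main__":
    cpus = {"i5-8400": (6, 9), "i5-3570": (4, 6), "i5-10400": (6, 12)}
    for name, (cores, l3_mb) in cpus.items():
        for fft_k in (384, 400):
            tasks, threads = best_split(cores, l3_mb, fft_k)
            print(f"{name} @ {fft_k}K FFT: {tasks} x {threads}-thread")
```

Under those assumptions the 384K FFT just fits in 3MB (2 cores x 1.5MB) and the 400K FFT does not, which is why the 1.5MB/core CPUs tip over to more threads per task while the i5-10400, at 2MB/core, does not.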