Due to RAM supply issues, Hardkernel canceled RK3399 based ODROID-N1 board last year, and decided to replace it with ODROID-N2 using a “newer SoC .. with faster CPU/GPU cores and native DDR4 support”, but they did not provide any details about the processor, and we speculated it could be the upcoming Amlogic S922X processor.
Hardkernel has now formally unveiled ODROID-N2, the first Amlogic S922X SBC to be announced, with 2 to 4GB DDR4 RAM, 4x USB 3.0 ports, Gigabit Ethernet, HDMI 2.0a video output up to 4K 60p and more.
ODROID-N2 SBC specifications:
- SoC – Amlogic S922X hexa-core big.LITTLE processor with 4x Arm Cortex A73 cores @ up to 1.8 GHz, 2x Arm Cortex A53 cores @ 1.9 GHz, Arm Mali-G52 GPU @ 846MHz; 12nm manufacturing process
- System Memory – 2GB or 4GB DDR4 RAM @ 1320 MHz
- Storage – 8MB SPI flash, eMMC flash module socket, micro SD card slot
- Video & Audio Output – HDMI 2.0a up to 4K @ 60 Hz, AV port (composite video + stereo audio)
- USB – 4x USB 3.0 ports, 1x micro USB 2.0 OTG port
- Expansions – 40-pin GPIO header with 2x I2C, UART, 6x PWM, SPI, S/PDIF, 2x ADC, and GPIOs
- Misc – 2x system LEDs, SPI/eMMC boot select switch, IR receiver, 2-pin header for RTC battery, 2-pin header for optional fan
- Debugging – 1x UART header for serial console
- Power Supply – DC power barrel jack
- Power consumption – Idle: 1.6~1.8 Watt; Heavy load: 5.2~5.3 Watt (stress-ng --cpu 6 --cpu-method matrixprod)
- Dimensions – 90 x 90 mm (TBC)
The processor is placed on the bottom of the board, and a large heatsink covering the complete bottom side of the SBC ensures proper cooling.
The company will provide an Ubuntu 18.04 LTS image with Linux kernel 4.9, as well as an Android 9.0 Pie image and BSP for the board. Hardware accelerated video decoding is working in Ubuntu including 4K/UHD H.265 60fps, but the Mali G52 GPU Linux driver only works on the framebuffer, as Arm has no plan to support X11 on Bifrost GPUs. A Linux Wayland driver will be released in a few months. The Wiki is already up, but still a work in progress at this time.
We had already seen some Amlogic S922X CPU benchmarks (Geekbench), but Hardkernel ran more benchmarks covering GPU, USB, and storage. According to their numbers, the board clearly outperforms ODROID-N1: multi-core performance is around 20% faster, DDR4 RAM 35% faster, and the Mali-G52MP6 around 10% faster, with around 340MB/s USB 3.0 throughput, over 900 Mbps Gigabit Ethernet Tx and Rx, and more.
The full details of the benchmarks can be found in the announcement linked in the introduction of this article. They also tested the board under load (with stress-ng) in a chamber set to 35°C, and the CPU temperature never exceeded 74°C.
The ODROID-N2 SBC will sell for $63 with 2GB RAM and $79 with 4GB RAM. I can already see some of you throwing bank notes at the screen shouting “just take my money”, but you have to calm down because the board will only start shipping in April, and Hardkernel plans to start selling ODROID-N2 at the end of March.
Thanks to T and Johannes for the tip.
Jean-Luc started CNX Software in 2010 as a part-time endeavor, before quitting his job as a software engineering manager, and starting to write daily news, and reviews full time later in 2011.
DAC and SPI boot are one of the best new features. N2 has multimedia written all over it 😀 Since we do know when H2 will be available again, I will wait until both are available and place one fat order. The fact that we can choose between 2 and 4 GB of RAM only shows that HardKernel has listened to us, the users. These are excellent news.
16$ difference between 2GB and 4GB DDR4 ram.
Lower supported regions on this globe should be preferred (at this N2 price tag)?
900Mbps+ over Ethernet isn’t hard, even the old H3 can do that, if you know how to tweak the drivers.
Interesting choice to go 4x big and 2x LITTLE, not what I expected. Looks like a solid piece of kit though.
Here’s hoping for good driver support, as that’ll be key.
Ah, we finally have an announcement for this board!
Good to see the CPU is slightly more powerful, but it is no replacement for the N1 in terms of storage. The N1 has two SATA ports and this has none. It has four USB 3 ports, but they are supplied through a hub, so really only one port in terms of bandwidth.
Oh well, it’s off to the RockPro64 for making a mini-NAS.
that’s one of the design decision i didn’t like about S922X leaks we had so far
they muxed usb3 and pcie… so you have to choose either one or the other
I hope that someone (pine64 people?) will release s922x boards with full-size PCIe instead of USB3.0.
Afaik there is only one lane pcie so you wouldn’t have any gains, just different interface with not so many devices to attach..,
> Afaik there is only one lane pcie so you wouldn’t have any gains, just different interface …
What about USB vs PCIe latencies?
It’s not just latencies but protocol support. If the use case is ‘NAS’ or storage for example you can even get better sequential speeds with an USB3 attached SATA controller like JMS578 compared to a single Gen2 lane PCIe attached SATA controller like ASM1061 (choosing this since used on ODROID N1).
But random IO performance will be a lot higher with the PCIe solution compared to USB3 while at the same time CPU utilization is lower (way less IRQs to be processed). And also less hassles with protocol layers since with a PCIe attached SATA controller you have direct access to the disks while with USB3 there are always problems (affecting spindown behavior or reading out SMART attributes and so on).
> It’s not just latencies but protocol support.
I meant total latencies, including protocol processing overheads.
I will be using a H2 to replace my XU4 as a NAS.
Nice board. Pity that shipping costs for odroid boards are typically fairly high (I seem to recall last time I checked it was in the $20-30 range)
I feel you, “Shipping fee: $28.00”. This is way too much for a board that size/weight
Hardkernel used to offer cheaper shipping, but too many packages got lost (I think around 5%, but not 100% sure), so now they use more reliable shipping options, and it’s a bit expensive.
If only it did not stop by the customs each time for me.
The rule of my customs seems to be take the declared price and proceed with a fee of half the price.
So I could expect a minimum $100 (without postage fee) in total for this SBC.
FYI, Hardkernel has a number of distributors in various countries (USA, Canada, France, UK, Russia, etc) which dramatically lowers shipping costs.
You can find their distributors page listed under the ‘Support’ tab on their website.
And what is Cortex -M4 for? Will it be possible to use it?
I have not seen the info specific to S922X, but S905X2 and S905Y2 have an ultra-low power always-on MCU for wake on voice support as reported @ https://www.cnx-software.com/2018/10/21/comparison-s905x-s905x2-s905x2-processors/
So I suspect that’s what the Cortex-M4 core is for on Amlogic S922X as well.
It’ll probably run an ATF (ARM Trusted Firmware) which will handle DVFS (Dynamic Voltage/Frequency Scaling) tasks. At least that’s what it does on the other AmLogic chips used in HK boards.
They have been putting 2 MCUs for the G12 generation. One for ATF/DVFS and one for voice processing/other uses. The M3 is for the ATF/DVFS.
looks tasty, i wonder whether the SoC supports 8GB ram.
Arm are being idiots not working to bring GPU support to Linux at an affordable price, in my opinion Arm are shooting themselves in the foot!
reverse engineering of the mali gpu is well underway, open source panfrost gpu drivers will be ready in a few months for OpenGL so that won’t be a problem for long. No Vulkan or OpenCL open source drivers have been planned though although that would be way easier once the OpenGL drivers are done.
Cheers 🙂
I thought they were doing GLES2 ATM, perhaps 3.x in the near future.
Yeah, GLES 3 is a lot of work that is not happening soon without a sponsor. GLES 2 is manageable. GL is through mesa software since the hardware doesn’t officially support it. ARM really needs to be involved or Vulkan and CL are not going anywhere. It took ARM themselves 2 years to get CL to mediocre shape on Android.
Does any of the experts here know if the Odroid XU4 was already so fast on release, or got continuously tweaked? The relative performance does not look so much better for the N2 compared to the relatively old XU4. Can we expect some more performance gains due to improved drivers in the future?
However power consumption seems to be better, with 5W max and stable thermals with a passive cooler, quite an improvement. Shut up and take my money indeed 🙂
Better to wait till they are used in the wild, unless your job desperately needs it.
Folk got excited at Allwinner H6, then found the Pcie crippled and broken.
> The full details of the benchmarks can be found in the announcement
Where? Wrt CPU performance I found no details at all (storage and network look fine though)
I mean I did not post all benchmark charts from the announcement, so more info can be found in the announcement thread. Not only for CPU, but the other benchmarked sub-systems as well.
You can get a rough idea of what to expect already since the CPU and GPU are similar to Amlogic A311D SoC.
I am just not sure what a 6EE Mali G52 is?
presumably 6EE = 6 executable units a.k.a Mali G52 MP6
> more info can be found in the announcement thread. Not only for CPU
There’s nothing ‘for CPU’. Only some funny graphs visualizing random data.
Using Unixbench in 2019 especially when comparing different platforms is pathetic (and Hardkernel should know this): http://www.brendangregg.com/blog/2014-05-02/compilers-love-messing-with-benchmarks.html
It’s a compiler benchmark and not a hardware benchmark. Using sysbench shows that you have no interest in serious CPU benchmarking at all. The comical failures when looking at sysbench numbers should be well known in the meantime. It’s a compiler benchmark as well. What do they compare here? RK3399 running a year ago on their canceled N1 with sysbench built with GCC 6.3 (Debian Stretch) vs. sysbench now built with GCC 7.3 (Ubuntu Bionic)? What do these numbers tell if they differ?
What is the thermal throttling test for? ‘stress-ng’ just generates some load but shows no performance numbers. If they would’ve used cpuminer and showed both thermals and khash/s rates that would provide at least some information.
@tkaiser I don’t get the last part.
If the frequency stays the same, why would performance degrade? The stress test shows that under load, there’s no throttling even with the passive cooling. Eg for xu4, under load throttling would lower frequency from 2ghz, and would dance around 1.8-1.7 Given the performance X for N2, under stress you have still X. Is khash a benchmark for you?
> Is khash a benchmark for you?
Not at all. But it provides information about ‘performance’ especially on those platforms where we can not trust in cpufreq information provided by sysfs (known as the ‘Amlogic and RPi dilemma’ ).Those khash/s values when graphed show whether performance degrades over time or not. Showing sysfs values representing ‘the clockspeed the cpufreq scaling driver thinks the CPU cores would run with’ is 100% useless to get an idea about whether thermal throttling happens or not.
The ‘Sysbench score vs CPU frequency’ graph clearly shows that we can not trust into what the cpufreq scaling driver thinks would happen or how do you interpret the curve between 400 MHz and 700 MHz? If this scales not absolutely linear we know there’s something wrong and the real clockspeeds differ from those set in Linux.
BTW: sysbench is crap as ‘general purpose CPU benchmark’ as everybody should know in the meantime. But due to its internal limitations (calculating prime numbers completely inside the CPU without any interaction with external DRAM) when used correctly it can be useful, e.g. for such stuff as getting an idea whether cpufreq scaling is flawed or not. Sysbench numbers have to scale linearly with both cpufreq and count of CPU cores.
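The linear-scaling expectation can be turned into a quick sanity check. What follows is only a minimal sketch in Python: it assumes you have collected (cpufreq, score) pairs yourself, with the score being a throughput figure where higher is better. The function name and the sample numbers are made up for illustration and are not Hardkernel's data. Each score is normalized to score-per-MHz, and points that deviate from the median by more than a tolerance are flagged.

```python
from statistics import median

def nonlinear_points(samples, tolerance=0.05):
    """Return the frequencies whose score-per-MHz deviates from the median
    by more than `tolerance` (relative).

    samples: list of (cpufreq_mhz, score) pairs, score being a throughput
    figure (higher is better), e.g. events per second.
    """
    per_mhz = [(mhz, score / mhz) for mhz, score in samples]
    ref = median(ratio for _, ratio in per_mhz)
    return [mhz for mhz, ratio in per_mhz
            if abs(ratio - ref) / ref > tolerance]

# Hypothetical measurements, NOT taken from any real board:
# the 500 and 667 MHz points score as if the core really ran at 1000 MHz.
samples = [(250, 250.0), (500, 1000.0), (667, 1000.0),
           (1000, 1000.0), (1398, 1398.0), (1800, 1800.0)]
suspicious = nonlinear_points(samples)
print(suspicious)
```

If the list comes back empty, the reported clockspeeds are at least consistent with the measured throughput. Any flagged frequency is a hint that the real clockspeed differs from what sysfs claims.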
the thermal graph shows thermal characteristic not cpu performance 🙂
> the thermal graph shows thermal characteristic not cpu performance
The thermal graph shows one sensor’s temperature trying to settle at 73°C which is a value not existing in DT: https://github.com/hardkernel/linux/blob/odroidn2-4.9.y/arch/arm64/boot/dts/amlogic/mesong12b.dtsi
The frequency graphs show always the same: 1800 MHz and 1900 MHz.
The SoC vendor is known for doing all the thermal and DVFS stuff so far not within Linux but as part of a proprietary firmware that gets loaded into a Cortex-M core living in the SoC. The same SoC vendor is also known for cheating with cpufreq in its whole 64-bit lineup so far.
So how would a test look like to show no thermal throttling happens? Running just load generators like stress-ng that show no performance indication and displaying sysfs values that might be completely irrelevant?
You know, fool me once, shame on you, fool me twice, shame on me. HK already was fooled once, I doubt they will be willing to have another c2-2ghz debacle
> I doubt they will be willing to have another c2-2ghz debacle
That’s the most irritating part, yes.
I mean you have an N1, right? If so, are you able to run ‘openssl speed sha256’? If so, do you get the ~770000 score Hardkernel uses to demonstrate that N2 with ~860000 would be faster? Or do you get ~990000 as I do?
What is the presentation of ‘mbw’ memory scores for? Hardkernel knows exactly that the memory performance when measured 1 year ago on N1 was the result of using RK’s 4.4 kernel with inappropriate settings. I posted almost immediately back then results with mainline kernel showing way better scores. And in the meantime we can achieve the same even with RK’s 4.4 kernel on RK3399, see https://forum.khadas.com/t/painlessly-usable-linux-distro/3124/24?u=tkaiser (having that in mind… nope, N2 allows not for faster memory access compared to N1 if the latter would run with today’s software).
What’s the rather meaningless ‘thermal throttling test’ with the suspicious temperature curve for? The ‘Amlogic issue’ is well known (not being able to trust in cpufreq numbers available to the kernel and an embedded firmware controlling DVFS and temperatures) so why didn’t they address it already?
Why did they choose to present laughable sysbench scores? They know since almost 3 years (C2 launch) that sysbench executes 15 times faster when built for ARMv8 compared to ARMv7. Why sysbench scores then? To show at least one area where N2 looks significantly faster than good old XU4?
> It’s a compiler benchmark and not a hardware benchmark.
> …
> It’s a compiler benchmark as well.
Most of the benchmarks are hardware AND compiler benchmarks. Look to the SPEC2006/2017 scores for evidences (GCC vs ICC for example).
But mostly I agree with your point. HK have chosen outdated and unsuitable CPU benchmarks.
> HK have chosen outdated and unsuitable CPU benchmarks.
I really hope it’s by accident and not by intention. Just checked sysbench with Ubuntu 18.04. Default package version is 1.0.11 which spits out totally different numbers than 0.4.12 (which was default on Ubuntu 16.04 and Debian Stretch and where we’ve seen performance differences of up to 50% on same hardware simply due to different compiler versions and flags)
Interesting to look back
https://www.cnx-software.com/2018/08/03/amlogic-a311d-cortex-a73-a53-processor-mali-g52-gpu/
And 51333 gives the bench marks too
https://browser.geekbench.com/v4/cpu/9397860
1100/2800 isn’t a ridiculous performance for a 4 core A73 @1,8GHz? RK3399 with only 2 core A72 shows same results…
> 1100/2800 isn’t a ridiculous performance for a 4 core A73 @1,8GHz?
Better ignore the BS. You find some early Geekbench numbers for S922X (not for A311D) already here https://browser.geekbench.com/v4/cpu/search?utf8=✓&q=galilei but they’re made with an Android not bringing up the CPU cores in aarch64 state.
And all of this is irrelevant now since with the N2 we could run benchmarks that show the real performance potential of S922X, it just hasn’t happened yet due to HK having chosen strange ‘benchmarks’ for their announcement.
> https://browser.geekbench.com/v4/cpu/search?utf8=✓&q=galilei but they’re made with an Android not bringing up the CPU cores in aarch64 state.
Oops, at least the results from ‘Feb 07, 2019’ are with CPU cores in aarch64 state but still missing support for ARMv8 Crypto Extensions. So for people looking only at total scores they need to be prepared that scores will further improve since as HK showed S922X support fast crypto. Multi-Core Score will most probably climb up to 4000 then.
Still rather useless numbers since Geekbench execution times are that low that potential throttling effects won’t be taken into account.
> Oops, at least the results from ‘Feb 07, 2019’ are with CPU cores in aarch64 state
These results are 32-bit, not 64-bit.
I was talking about the CPU cores / kernel and not the userland. ‘ARM implementer 65 architecture 8’ means ARMv8 so cores are in 64-bit mode while the result from ‘Jan 31, 2019’ showed the cores being brought up in ARMv7 mode (architecture 7 — Geekbench simply parses /proc/cpuinfo and assembles the info to a weird string)
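To make the /proc/cpuinfo point concrete, here is a rough Python sketch of how those fields can be read. The sample text is a hand-written excerpt for illustration, not output captured from an N2, and the helper name is made up. Implementer 0x41 is decimal 65 (Arm), and part numbers 0xd03 and 0xd09 correspond to Cortex-A53 and Cortex-A73.

```python
def parse_cpuinfo(text):
    """Return a list of per-core dicts from /proc/cpuinfo-style text."""
    cores, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line separates one core's block from the next.
            if current:
                cores.append(current)
                current = {}
            continue
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    if current:
        cores.append(current)
    return cores

# Hand-written sample (an assumption), trimmed to the relevant fields.
SAMPLE = """\
processor\t: 0
CPU implementer\t: 0x41
CPU architecture: 8
CPU part\t: 0xd03

processor\t: 2
CPU implementer\t: 0x41
CPU architecture: 8
CPU part\t: 0xd09
"""

for core in parse_cpuinfo(SAMPLE):
    implementer = int(core["CPU implementer"], 16)  # 0x41 -> 65
    print(implementer, core["CPU architecture"], core["CPU part"])
```

On a real system the same function could be fed the contents of /proc/cpuinfo. Note that this only describes the state the kernel brought the cores up in, not whether a given benchmark binary is 32-bit or 64-bit.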
Ha OK. Anyway the result pages all say Android 32-bit and ARMv7, so the benchmark was run as a 32-bit app and this is what matters when comparing results.
As far as AES goes, cryptography extensions were only enabled for 32-bit starting with version 4.1: https://www.geekbench.com/blog/2017/03/geekbench-41/
This proves that this is running as a 32-bit app 😉
We like kernel compile time (numbers), once in shared memory (possible with top 4GB?) and once from storage disk. The (mainline) kernel is a stable instance for comparability, not a detailed view, but an absolute number for improving hardware with real world usage.
(Floating point numbers are double precision 64bit or quadruple precision 128bit in benchmarking tests?)
> This proves that this is running as a 32-bit app
As already written few posts above: Multi-Core Score will most probably climb up to 4000. Based on what we currently know I expect S922X being ~25% faster with rather irrelevant multi-core scores than RK3399 both running at reasonable upper clockspeeds (that’s 2.0/1.5Ghz for RK3399 and whatever clockspeeds Amlogic’s BLOB allows the cores in S922X to run at)
> So for people looking only at total scores they need to be prepared that scores will further improve since as HK showed S922X support fast crypto. Multi-Core Score will most probably climb up to 4000 then
Confirmed: https://browser.geekbench.com/v4/cpu/12114546
Lower single-threaded performance than RK3399 but approx. 25% faster multi-core scores.
temperature chart in the original post is “interesting” :-). Temperature in degrees Celsius.
scale is 50000 to 85000
I hope this does not reflect the quality of the metrics themselves.
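A likely explanation for that axis: Linux thermal zone nodes in sysfs report temperature in millidegrees Celsius, so 50000 to 85000 reads as 50°C to 85°C. Here is a small Python sketch of the conversion. The sysfs path is an assumption, since the zone index differs between boards and kernels.

```python
from pathlib import Path

def millidegrees_to_celsius(raw):
    """Convert a raw sysfs thermal reading (millidegrees Celsius) to degrees."""
    return int(raw) / 1000.0

def read_soc_temp(node="/sys/class/thermal/thermal_zone0/temp"):
    """Read one thermal zone; the zone index is an assumption and may differ."""
    return millidegrees_to_celsius(Path(node).read_text().strip())

# The axis limits from the chart, interpreted as millidegrees:
print(millidegrees_to_celsius(50000), millidegrees_to_celsius("85000"))
```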
Very interesting board and design overall, as usual with HK. However I have some strong doubts about the temperature curve, especially with the long history of lies behind Amlogic. The temperature climbs very fast and suddenly becomes “flat” once it reaches 74 degrees. By “flat” I mean it’s oscillating very quickly between 73 and 74. For me this is a proof of throttling with a 1-degree hysteresis. Normal temperature rise should continue to climb over time, but slower as the temperature is higher. It’s indeed very likely that the micro-controller adjusts the frequency in real time behind the curtains without any way for it to be reported in sysfs.
I think that a short-term test (hashing, compression or whatever) before the temperature reaches the limit, and the same test over a longer period reaching the temperature limit would indicate how much throttling was in effect. For example you could have 1 kH/s during the first 10 seconds, and 0.7 kH/s over one minute.
Thus at this point I really suspect that the 1.8 GHz frequency is only attained during the beginning of the test. But I could be totally wrong.
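The short-versus-long test idea could be sketched roughly like this in Python. It is an illustration only: it hashes a fixed block in a loop, records throughput per time window, and compares later windows against the first. The function names and window lengths are arbitrary choices, and the loop loads a single core only, so heating up a whole SoC would need one such process per core and much longer windows.

```python
import hashlib
import time

def hash_rate_windows(windows=3, window_seconds=0.2, block=b"\0" * 4096):
    """Hash a fixed block in a loop and return hashes/second per time window."""
    rates = []
    for _ in range(windows):
        count = 0
        start = time.perf_counter()
        while time.perf_counter() - start < window_seconds:
            hashlib.sha256(block).digest()
            count += 1
        elapsed = time.perf_counter() - start
        rates.append(count / elapsed)
    return rates

def throttling_ratio(rates, head=1):
    """Mean rate after the first `head` windows divided by the mean rate of
    the first `head` windows; values well below 1.0 suggest a slowdown."""
    early = sum(rates[:head]) / head
    late = sum(rates[head:]) / len(rates[head:])
    return late / early

rates = hash_rate_windows()
print([round(r) for r in rates], round(throttling_ratio(rates), 2))
```

A drop from 1000 H/s in the first window to 700 H/s afterwards would give a ratio of 0.7, regardless of what the cpufreq values in sysfs claim.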
sbc-bench exists for a reason. To answer all these questions in shortest time possible (relying on your cool ‘mhz’ utility to spot cpufreq cheating).
All that’s needed is to link /etc/armbianmonitor/datasources/soctemp to the sysfs node for the CPU temperature and fire up ‘sbc-bench -c’. All questions about real clockspeeds and thermal throttling reliably answered and a rough performance estimate available.
When they announced the H2 they got in touch with me, gave me remote root access to a few H2 prior to launch to adapt sbc-bench and to generate benchmark numbers. This did not happen this time and it’s not even needed since sbc-bench should run out of the box.
Then probably they have been careful and have run their tests. They have been the first real victims of Amlogic’s cheating and have dealt with it in a very responsible way, so I would trust them to deeply test their products nowadays to never get caught again.
> Then probably they have been careful and have run their tests
I really hope they did not run any real tests yet and the choice of published numbers has been done by a trainee and not someone knowledgeable.
Given that current Hardkernel lineup consists of canceled, outdated, unavailable and underwhelming (alphabetical order) I guess they had to choose between canceled and underwhelming here. Which makes kinda sense given low software development efforts with N2 and C3 based on same platform.
Well, to their defense they offer long term support on their hardware and even upgrade the software. So they don’t really *need* to make hardware based on the most recent designs, just something attractive enough to start selling. But their ability to maintain the software and to benefit from mainline is critical to cut costs over the long term.
What about video encoding on the VPU? I thought S922X was supposed to do encode streams up to 4K H.265/H.264…
> video encoding on the VPU
Only curious (since there’s no CSI input on this board)… What’s your use case?
Low power NAS with transcoding. But in any case I wouldn’t use this board…
Dimensions – 90 x 90 mm is really toooo big
I prefer small boards
btw it looks like ODROID C3 is on the way
https://github.com/hardkernel/linux/commit/959c23c7b5a466e43283bf0259b535013a3b913c
And it seems to be based on Amlogic g12a instead of the g12b on the N2. I wonder what are the differences between the two SoCs…
G12a should be Amlogic S905X2 according to https://github.com/torvalds/linux/commit/9c8c52f7cb4f3b604bb9836947dadfbb255f465f
Ah yes, I forgot about it! So it seems ODROID-C3 will be based on S905X2…
Nice catch.
this morning i checked the repo and found C3 related info too (based on another commit, the tested SoC should be S905D2)
i DMed odroid and the answer i got was that this C3 was abandoned last year
edit:
in fact, this commit message also shows it’s s905d2
> this C3 was abandoned last year
So maybe better looking at that C3?
https://github.com/hardkernel/linux/commits/959c23c7b5a466e43283bf0259b535013a3b913c/arch/arm64/boot/dts/amlogic/meson64_odroidc3.dts
The OPP there are interesting. Up to 2 GHz under 1.0V, and 1.2 GHz below 0.8V. I doubt this can be achieved at 28nm, probably it’s much smaller (14nm?). Maybe one day we’ll be surprised to see Amlogic making good products 🙂
> The OPP there are interesting
Why? Do you believe Amlogic now does DVFS from within Linux and not their proprietary firmware running on one of the Cortex-M?
My comment was directed at the timestamps. If there is a C3 that ‘was abandoned last year’ then what is this for a C3 that got 4 DT adjustments this year?
Maybe precisely because they could sort out the truth from the lies and figured they could finally make a reasonable product out of what remains of it ?
They are still using the prototypes in house, so they still update the kernel.
Weird. Up to now I thought the only little piece of real information wrt CPU performance published by Hardkernel would be that ARMv8 Crypto Extensions in S922X (A73) perform better than RK3399 (A72). They show a funny graph with XU4 being at 240,000, N1 at 770,000 and N2 at 860,000.
Conclusion: A73 at 1.8GHz (if we believe into this number) +10% faster here than A72 at 2.0 GHz. Then I realized that they used ‘openssl speed sha256’ so not comparable to the countless already collected results using ‘aes-256-cbc’ instead. Why choosing a different method here instead of one that can be easily compared to other SoCs/boards?
Anyway, let’s try to figure out what these numbers mean, checking RK3399 myself on both an A53 core at 1.5 GHz and an A72 at 2.0 GHz: The test produces this output:
sha256 110970.90k 309824.49k 642760.87k 878455.81k 979105.11k 990292.65k
Where is the 770000 number Hardkernel is using to show that N2 is superior compared to N1? Ok, maybe they did this test one year ago on the N1 back then running Debian Stretch:
sha256 110068.59k 303448.26k 633643.52k 874154.67k 981093.03k 987622.06k
Full results: https://pastebin.com/raw/Xmnw3Jbn
What are those for numbers Hardkernel uses in their graphs?
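For anyone wanting to compare such result rows programmatically, here is a minimal Python sketch that parses the two sha256 rows quoted in this comment. ‘openssl speed’ prints throughput in thousands of bytes per second with a ‘k’ suffix, one column per block size. The helper name is made up, and which column Hardkernel charted is not known, so taking the last column (largest block size) is an assumption.

```python
def parse_speed_line(line):
    """Parse one result row of `openssl speed` output into (name, [bytes/s]).

    Values are printed in units of 1000 bytes per second with a 'k' suffix.
    """
    name, *fields = line.split()
    return name, [float(f.rstrip("k")) * 1000 for f in fields]

# The two RK3399 rows quoted in the comment.
rows = [
    "sha256 110970.90k 309824.49k 642760.87k 878455.81k 979105.11k 990292.65k",
    "sha256 110068.59k 303448.26k 633643.52k 874154.67k 981093.03k 987622.06k",
]

HK_N1_SCORE_K = 770000  # figure from Hardkernel's graph, as cited in the comment

for row in rows:
    name, values = parse_speed_line(row)
    last_k = values[-1] / 1000  # last column = largest block size (assumption)
    print(name, round(last_k), "k, ratio vs 770000:",
          round(last_k / HK_N1_SCORE_K, 2))
```

Both rows land well above the 770000 figure in the last column, which is the discrepancy the comment is pointing at.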
We (various forum members) will have samples in a few days and can test actual performance Thomas.
I am sure the tests Hardkernel uses aren’t to hide performance or skew results, it’s just the same tests they always use.
> it’s just the same tests they always use
Care to look at the N1 announcement yourself? https://forum.odroid.com/viewtopic.php?f=149&t=29932
Except for Unixbench (which was an insane choice for a ‘benchmark’ in 2018 already) and mbw (with Hardkernel using an averaged score making the results rather useless) everything else was different, but back then even rather useless benchmarks like Unixbench/NBench could be used for comparisons (performance of the A53 cores for example).
With N2 right now there’s nothing wrt CPU performance which is a bit… irritating.
I should have looked more closely at the previous tests.
Since this SoC has the same cores as the Amlogic A311D but a tweaked 6EE GPU, and benchmarks for the A311D are already posted on CNX-Software, people already have an idea of this new SoC’s performance, without all these guesses!
In Summary for S922X:
Good:
4xA73
Low power
Low heat
Low cost
MP6 GPU
DDR4
Bad:
No DP
No PCI-E
USB 3.0 Hub
The lack of PCI-E really screws up SSD use :'(
> In Summary for S922X:
> …
> No PCI-E
You meant ODROID-N2, but not Amlogic’s S922X, right?
Thanks for the correction. PCI-E x1 can’t really be used with NVMe SSDs. It’s slower than SATA so it’s only useful for WiFi.
> PCI-E x1 can’t really be used with NVMe SSDs. It’s slower than SATA
Huh? That’s focussing on rather irrelevant sequential transfer speeds, right?
SATA means using AHCI as protocol and NVMe is always the better choice. So even if it’s just a single Gen2 lane and even if synthetic benchmarks show higher sequential transfer speeds I would always prefer NVMe due to being way more efficient, less CPU utilization, bidirectional so better performance with mixed workloads and so on.
Unfortunately Hardkernel avoided PCIe here (which is somewhat understandable after their PCIe misconception with ODROID N1) so we can’t get an idea how S922X behaves (could be the same sh*t show as with Allwinner H6)
As stated above, pcie is pin muxed with usb3.
Between the 2, usb3 is an easy choice.
I prefer one lane of PCIe 2.0 instead of one USB3.0 bus.
> As stated above, pcie is pin muxed with usb3
Yeah, PCIe would’ve made everything a lot more expensive.
Also it can be used for hard disk drives (via PCIe-SATA cards).
there is something i’m not clear with the GPU MP number :
is it something ‘official’ to count the number of execution engines rather than shader cores with bifrost gpu ?
Intel does this, too, for their integrated GPUs. NVIDIA as well, if they speak eg of “Cuda cores”, I think that’s the equivalent of an execution unit here, or?
ARM will really take over intel when it starts selling upgradable computers starting with 4GB of RAM & they have more than 1 sata/m2 connection + option to upgrade RAM. I was waiting for an ARM board with 4GB of RAM. The waiting is over!
You must be new here 🙂 Welcome
Some Arm boards or even full computer with 4GB RAM or more:
RK3399 + 4GB RAM – https://www.cnx-software.com/2019/01/15/orange-pi-rk3399-4gb-ram/
24-core Arm processor + 4GB RAM (but can be upgraded to 64GB) – https://www.cnx-software.com/2018/03/21/edge-server-synquacer-e-series-24-core-arm-pc-is-now-available-for-1250-with-4gb-ram-1tb-hdd-geforce-gt-710-video-card/
Hisilicon Kirin 960 with 4GB RAM – https://www.cnx-software.com/2018/01/13/hikey-960-android-development-board-gets-a-4gb-ram-version-for-250/
RK3399 with 4GB ($95) – https://www.cnx-software.com/2018/08/24/nanopi-m4-raspberry-pi-rk3399-board/
Marvell based headless board with one SO-DIMM slot for up to 16GB RAM – https://www.cnx-software.com/2016/10/11/solidrun-macchiatobin-is-another-marvell-armada-8040-networking-mini-itx-board/
And the list goes on and on.
But this ODROID-N2 may be one of the most cost effective Arm boards with 4GB RAM.
Their choice of benchmarks aside, HK appear to have the propensity of falling for Amlogic’s more, erm, weird parts. For instance, here we have a setup of 4x CA73 + 2x CA53 (so far so good) that runs at 1.8GHz and 1.9GHz, respectively. Yes, there will be single-threaded workloads that will run faster on the LITTLE cores, just wait and see.
Ermm, no 🙂
A bike has bigger wheels than a car, but it will never go faster :))
Some bikes will go faster on _some_ routes than some cars ; )
While the CA73 uarch is a novelty to me, the CA53 is not, and CA53 has been known to outperform CA72/CA57 _per_clock_ on the right kind of workload. Add to that the prospect that big cores will be prone to thermal constraints much more frequently, and.. you want to bet? : )
I agree, I’ve witnessed this already on some instructions. I think it was the integer divide which took 5 cycles on A53 vs 6 on A72, but this would need to be rechecked. And of course everything which runs in-order at one instruction per cycle (e.g. crypto) will run faster on the higher-clocked A53 here. But to be honest, the difference will be very small (5% peak) so it’s unlikely anyone will notice, and a few percent of out-of-order code might suffice to offset this.
S922X’s OoO cores can run a lot faster than 1.8GHz in reality. It is memory starved by the 32-bit DDR interface so 6 cores will just burn power at higher frequencies waiting on memory. Need faster DDR4 or LPDDR4. That’s where RK3399 shines with its 64-bit interface. 32-bit DDR4 at 2666MT(1333MHz) is about 35% slower than 64-bit DDR3 at 1600MT(800MHz).
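For reference, the theoretical peak numbers behind this kind of comparison are easy to work out: bus width in bytes multiplied by the transfer rate. The Python sketch below only covers that paper calculation and ignores timings, controller efficiency and everything else that affects real-world results. The function name is made up for illustration.

```python
def peak_bandwidth_mb_s(bus_width_bits, transfers_mt_s):
    """Theoretical peak bandwidth in MB/s: bytes per transfer x MT/s."""
    return bus_width_bits / 8 * transfers_mt_s

ddr4_32 = peak_bandwidth_mb_s(32, 2666)   # 32-bit DDR4 at 2666 MT/s
ddr3_64 = peak_bandwidth_mb_s(64, 1600)   # 64-bit DDR3 at 1600 MT/s

# Relative deficit of the 32-bit DDR4 setup on paper.
print(ddr4_32, ddr3_64, round(1 - ddr4_32 / ddr3_64, 3))
```

On these paper numbers alone the 32-bit DDR4 configuration comes out at 10664 MB/s against 12800 MB/s, roughly 17% behind. Measured differences can be larger or smaller than that.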
> That’s where RK3399 shines with its 64-bit interface
But haven’t you seen Hardkernel’s mbw ‘benchmark’ that clearly shows that N1 (RK3399) is a lot slower than S922X when it comes to memory performance?!?!
Isn’t that HK’s own decision? From the SoC diagram above, the chip supports 4x16bit DDR4L (incl) interfaces, no?
Not sure HK has anything to do with this
when you look at the pictures in HK forums, you see there are 4 memory chips used in this Odroid N2 board : 2 at the top side & 2 at the bottom side of the PCB
chances are high that they use the whole 4 x 16 bits interface
Most probably Da Xue still believes into the Amlogic marketing department claim ’12nm manufacturing process’ and searches for excuses why reality doesn’t match the claim (searching for an explanation for those low clockspeeds used with S922X)?
There are plenty s905x2 boxes on sale now, is there no one, who can confirm the true manufacture method, 12nm or not ?
A few weeks ago, there was this article about Samsung Exynos 7904
https://www.cnx-software.com/2019/01/21/samsung-exynos-7904-processor-triple-camera/
Samsung Exynos 7904 specifications:
CPU – Dual-core Cortex-A73 up to 1.8 GHz, and hexa-core Cortex-A53 up to 1.6 GHz
Process – 14nm FinFET Process
As you can see, this SoC’s frequencies are on par with S922X.
According to your reasoning, should we also believe this 14nm process is a lie from samsung marketing department ?
I don’t.
> According to your reasoning, should we also believe this 14nm process is a lie from samsung marketing department ?
No, why would we? Samsung is a reputable vendor owning modern fabs and able to produce in-house the SoCs they designed before. They do mobile SoCs where battery life is important so moving to a more expensive process to get decent performance at low consumption/temperatures is a logical consequence. Their stuff ends up on Anandtech&Co and experts analyze what’s going on. Cheating would be spotted immediately.
None of this applies to a random manufacturer of cheap TV box SoCs like Amlogic.
BTW: I honestly don’t care about the fab node any of these SoCs are made with since all I’m interested in is how CPU performance looks like at which consumption level and whether thermal throttling becomes an issue or not.
All the little information we have right now (I hope this will change soon) doesn’t look that promising. Lower single-threaded performance compared to an old 28nm design like RK3399 running at 2.0GHz and multi-threaded performance only ~25% better than such a RK3399 which is a bit surprising given there are 4 big cores on S922X.
I wouldn’t be surprised if A73 cores are limited to 1.7GHz with single-threaded loads and clockspeeds even decrease when 2 or more cores are active. With access to an N2 it would be a matter of minutes to get an idea what’s really going on.
> All the little information we have right now (I hope this will change soon) doesn’t look that promising. Lower single-threaded performance compared to an old 28nm design like RK3399 running at 2.0GHz and multi-threaded performance only ~25% better than such a RK3399 which is a bit surprising given there are 4 big cores on S922X.
What S922x Geekbench result are you comparing to? If you are comparing with the ones found in PrimateLabs DB then the comparison is hard to make as these are 32-bit results.
You can find A72 vs A73 results on SPEC 2006 here: https://www.anandtech.com/show/12195/hisilicon-kirin-970-power-performance-overview/2
SPEC 2000 here: https://www.anandtech.com/show/11088/hisilicon-kirin-960-performance-and-power/2
Kirin 955 has A72 up to 2.5 Ghz while Kirin 960 has A73 up to 2.4 GHz. And both on a TSMC 16nm process.
As you say it remains to be seen what frequency the S922x really runs at. But the quoted frequency of 1.8 GHz is quite disappointing.
> As you say it remains to be seen what frequency the S922x really runs at
I asked Hardkernel folks whether they tested with Willy’s mhz tool already. They didn’t so far but will do and provide results next week.
> … Amlogic marketing department claim ’12nm manufacturing process’ …
Wikichip:
—–
An enhanced version of TSMC’s 16nm process was introduced in late 2016 called “12nm”.
…
In late 2016 TSMC announced a “12nm” process (e.g. 12FFC) which uses the similar design rules as the 16nm node but a tighter metal pitch, providing a slight density improvement.
—–
https://en.wikichip.org/wiki/16_nm_lithography_process
It is definitely from a “12nm” fab. I’m not sure why you’re doubting this. The device runs at low voltages but can scale to 2.2GHz+ with higher voltages (with ATF that allows this). Dev boards have been sampling for a few months. Single core performance can easily outpace RK3399 at the higher frequencies but multicore performance is severely memory bandwidth limited. At the stock frequency, it uses only half the power of RK3399.
> but can scale to 2.2GHz+ with higher voltages (with ATF that allows this)
What you refer to as ATF (ARM Trusted Firmware) is the firmware the Cortex-M3 runs with in reality, right? The ATF construct is just there to ensure the firmware is not modified? And this firmware controls thermals and DVFS, right?
And given this refreshing piece of information https://forum.odroid.com/viewtopic.php?f=176&t=33781&sid=21f10d22b4029342febb4abc3f1754a3&start=50#p246118 it doesn’t seem Amlogic allows to adjust DVFS settings on the N2 (higher voltages to allow for higher clockspeeds).
While @willy is looking at DVFS OPP in some DT files it looks like it’s the same as always with Amlogic: this stuff can’t be controlled from within Linux but happens solely in the closed source domain in the contained Cortex-M3.
BTW: Since it seems Amlogic or Hardkernel chose to use different sysfs nodes than the rest of the world I tried to adapt sbc-bench for ODROID N2 in the meantime.
Yes the signed ATF “chain”. Given that you can burn out SoCs, ex. Pine64/OrangePi/etc, the thermal/DVFS locking in ATF makes sense if you were in Amlogic’s shoes. It’s a fail safe in case both the cooling and code are crap or non-existent. All one needs to do is look at Android benchmarking to see where custom DVFS can go wrong or become unrealistic.
> it doesn’t seem Amlogic allows to adjust DVFS settings on the N2 (higher voltages to allow for higher clockspeeds)
Since Amlogic didn’t “lie” about CPU frequencies this time around, I don’t see any reason for them to give HK ATF access. BTW, HK locks down their software if they detect a non-HK board. Pretty douche move, especially since their images are nothing special and just another OpenLinux implementation.
Not sure why you had to adapt sbc-bench.
/sys/devices/system/cpu/cpufreq/policyX is pretty standard.
On my XU4:
$ cat /sys/devices/system/cpu/cpufreq/policy0/scaling_available_frequencies
200000 300000 400000 500000 600000 700000 800000 900000 1000000 1100000 1200000 1300000 1400000
$ cat /sys/devices/system/cpu/cpufreq/policy4/scaling_available_frequencies
200000 300000 400000 500000 600000 700000 800000 900000 1000000 1100000 1200000 1300000 1400000 1500000 1600000 1700000 1800000 1900000 2000000
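The same check can be scripted so it works on any board exposing the standard cpufreq policy nodes. This is a minimal sketch: the function name is made up for illustration, and it assumes each policyX directory offers scaling_available_frequencies as in the XU4 output above (some drivers do not provide that node, in which case the policy is skipped).

```python
from pathlib import Path


def available_frequencies(cpufreq_dir="/sys/devices/system/cpu/cpufreq"):
    """Return {policy name: [frequencies in kHz]} for every policyX directory found."""
    result = {}
    base = Path(cpufreq_dir)
    if not base.is_dir():
        # No cpufreq support (or not running on Linux): nothing to report.
        return result
    for policy in sorted(base.glob("policy*")):
        node = policy / "scaling_available_frequencies"
        if node.is_file():
            result[policy.name] = [int(value) for value in node.read_text().split()]
    return result


# Print the frequency range of each cluster in MHz.
for name, freqs in available_frequencies().items():
    if freqs:
        print(f"{name}: {min(freqs) // 1000} - {max(freqs) // 1000} MHz")
```

On a big.LITTLE SoC this should list one policy per cluster, for example policy0 and policy4 on the XU4.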
Ok, let’s assume S922X is on “12nm” process and can scale to 2.2GHz.
There’s still something I don’t catch. Why did HK (or Amlogic) choose to limit the big cores to 1.8GHz and lower voltages? The kind of devices it’s designed for shouldn’t be power constrained…
Maybe because performance would be limited in any case by the memory bandwidth?
> Maybe because performance would be limited in any case by the memory bandwidth?
BTW: all the ‘CPU benchmarks’ HK is using to show how much faster N2 should be compared to N1 do not rely on memory bandwidth at all since they all run inside the CPU’s caches (that’s one of the reasons choosing them is so weird/misleading)
Interesting, in the meantime HK published 7-zip numbers: https://forum.odroid.com/viewtopic.php?f=176&t=33781&p=245876#p245876
7-zip is not only about CPU performance but also about memory latency (not bandwidth). And it’s a bit strange to notice that HK last year ran RK3399 on N1 with the A72 cores at 2 GHz while they now write they set the A72 to just 1.8GHz. Anyway, the graphs look more like A72@2GHz which still poses the question how on earth S922X with 4 A73 cores and also A53 at higher clockspeeds can only be 20% faster compared to RK3399?
Looking at detailed memory performance, as well as single-threaded scores and thermal behavior would be interesting which is (not so) surprisingly the stuff sbc-bench is doing 🙂
Do I recall correctly that the RK3399 came from work between Rockchip and Intel and at launch was a high premium SoC? Whereas I suggest the Amlogic S922 is a middle premium SoC, based on SoC cost alone.
Also the low voltage design suggests it is aimed at passively cooled small devices: TV boxes and TVs with built-in voice controls etc.
The GPU should have a Neural Processing Unit too, going off the Arm spec.
Also the low voltage is a design result of the method used for this 12nm process.
There’s an update/ps to that:
‘P/S, We will check the memory performance more carefully in the next week because RK3399 slow memory issue seems to be solved with a couple of patches in the Kernel.
We’ve not played with the N1 Kernel since we dropped it several months ago.
I guess the memory bandwidth difference may be negligible not 35% if we apply the patches.’
Chances are all N1 results in those charts are with the wrong kernel.
> There’s an update/ps to that
Most probably a simple reaction to me linking to https://forum.khadas.com/t/painlessly-usable-linux-distro/3124/24?u=tkaiser here in the comments 18 hours ago.
While ayufan figured out how to dramatically increase memory bandwidth on RK3399 with Rockchip’s 4.4 kernel (so that they’re on par with mainline kernel!) this shouldn’t affect any of HK’s benchmark choices since they only use either benchmarks that are not affected by memory access at all (the insane Unixbench stuff and sysbench) or depend more on memory latency (7-zip).
If a benchmark is broken by design (Unixbench in this century or the pathetic sysbench stuff) you can’t fix these benchmarks by kernel patches. Even if numbers change they’re still numbers without meaning.
> I guess the memory bandwidth difference may be negligible not 35% if we apply the patches.
This read ‘I guess the memory bandwidth difference may be 10% not 35% if we apply the patches’ two hours ago. Good to see it’s a dynamic process happening here 🙂
> chances are high that they use the whole 4 x 16 bits interface
Hardkernel also talks about 32bit DRAM data bus width (better forget about the hdparm scores there since made with a tool which tests with insufficiently small 128KB block size — but this is the good news: USB3 storage performance with S922X is a lot better than reported yet since Hardkernel’s published numbers were made with UAS disabled)
“Arm has no plan to support X11 on Bitfrost GPU. A Linux Wayland driver will be released in a few months”
May one therefore conclude that Arm now regards X11 as “dead/obsolete/kicked the bucket/shuffled off its mortal coil/rung down the curtain” and that the Wayland era has not just begun but is now the de facto mainstream?
ARM cares where they invest engineering time to get paying customers. In 99% of the cases, it’s Android. Samsung cared about Wayland so ARM cared about Wayland. Judging by the recent shift away by Samsung, I don’t think ARM will be up for supporting Wayland unless others verbally say they care. And by “they”, I really mean just Google. It’s not like ARM’s implementation was in any way usable for Linux since most apps still require Xwayland. Only those apps that were built on frameworks like QT or GTK have an easy migration to Wayland.
For all the enthusiasm I have for open source, you still need big backers like Samsung, Huawei, Google, Facebook, Amazon, etc. However, they’re all working on sectioning off their creations.
There is a long proven method for getting better GPU driver support, gaming and GPU competition.
Never mind retro consoles, more games and gaming on Android TV boxes is the market that will drive GPU driver support. (Sadly Google, Linux and Arm show no interest in working together towards a unified Linux/Android display.)
So until we get an affordable TV box with PCIe that you can plug an Nvidia or AMD graphics card into, things will move slowly.
The market does not need better than Xbox or PlayStation; as Nintendo showed, it just requires quality enjoyable games at an affordable price. In my opinion.
ChromeOS does wayland, so google will very likely be interested in arm’s continual wayland support.
S922X has a Cortex-M4 controller unit. How did developer team and firmware for this M4 change?
“The SoC vendor is known for doing all the thermal and DVFS stuff so far not within Linux but as part of a proprietary firmware that gets loaded into an Cortex M core living in the SoC. The same SoC vendor is also known for cheating with cpufreq in its whole 64-bit lineup so far.”
Thx, that’s interesting therefore, but not all truth for all people that contributed inside that SoC vendor.
Why Hardkernel sbc*s get that much attention?
> all people that contributed inside that SoC vendor.
Huh? Do you realize that we’re talking about Amlogic and not HK (or Libre Computer or Khadas)? Do you have any insights into this company or know people working there?
Insight into development, state of and people connected with Cortex-M4 firmware would be helpful for understanding why and what for this firmware prevents cpu related transparent&direct monitoring. AFAIK HK team is not involved with Cortex-M4 parameter setting.
MCU changed to next generation, so this question arose.
Are Amlogic people around here also?
Ok. Power Management Processor is second Cortex-M3. Understand now, thanks.
> Why Hardkernel sbc*s get that much attention?
Simply because HK is one of the few companies around who’s known for investing a lot of time to polish their products and sell them in good shape, and assure the maintenance for a long time. They do care about quality. It doesn’t mean it always works perfectly, but at least they honestly try. There are 3 or 4 such companies often getting the same attention for the same reasons, then around you have the usual amount of untested crap from various vendors that gets mostly sold as set-top-boxes.
Next generation set-top-boxes’ performance will be sufficient for desktop replacement with light tasks, also. So competition will be getting tougher and price differences are getting less of an advantage.
Next generation A76 is comparable to Intel Kaby Lake level CPUs. A73 is half of that ‘laptop-class’ performance of A76.
https://en.wikipedia.org/wiki/Comparison_of_ARMv8-A_cores
Sure but the first A76 you’ll find in an STB will run at fake frequencies, with poor power management, with incorrect DRAM timings causing it to be slow as hell or to crash every 2 hours, and there will be no way to install a proper heat sink on it. When companies like Hardkernel, T-firefly, or FriendlyElec make a product out of an SoC, you get well-thought cooling, appropriate DRAM timings, some margin to explore extra DVFS operating points, documentation, schematics, etc. This makes a huge difference and can justify the price difference if you have to pay people for the R&D work (well the reverse-engineering should I say for STB hardware).
I’ve made my first build farm out of HDMI sticks. It was OK as a PoC but a disaster in terms of quality. I’ve put Linux on my H96-max which was sold as an Android STB. I spent countless weekends on it and was not that satisfied. Neo4 arrived, goodbye H96! All this to say that once you start to see STBs appear, SBCs are not that far off because a reference design exists and the aforementioned companies are going to make something good out of it. So there really is no reason to waste so much time on STBs.
As far as I am aware, Amlogic have a reference design STB for s905x2, Y2 and s922x
> When companies like Hardkernel, T-firefly, or FriendlyElec make a product out of an SoC, you get well-thought cooling, appropriate DRAM timings, some margin to explore extra DVFS operating points, documentation, schematics, etc.
Hear hear. It’s important to distinguish throw-away products from viable products on the SBC market. I’d add to your list of reputable vendors olimex, solid-run and vendors who use solid-run’s SOM designs like kobol (of helios fame).
Totally agree with your extra vendors list.
If these are good firms, explain to me as a normal user. How do they allow gross errors in the development of one of the fundamental systems – “cooling system” (without the correct operation of which there can be no question of high performance) ?
I started by trying to use some of the Balbes constructs in my VIM2, plagued by the same complex and contradictory instructions to install. I’ve never been able to boot. I would not waste my time. I think the problem is that it makes changes to the bootloader but can not document how to implement them properly – all of its instructions on multiboot activation are unintelligible. I think it’s a language problem in your heart.
You could waste your time working on your compilations without success!
I have to agree with you, Johnny.
All you have to do is use CoreELEC only once to realize the size of the failure and the Balbes150
Detail there is not a single person on this planet who has managed to run their compilations without problems!
Balbes150 save your critiques for yourself
For by causes of people like you LibreELEC is sinking made titanic. Best done OpenELEC.
Maybe your criticism comes poor not having received a courtesy
You don’t even know what you are talking about. Stop spreading FUD. I got both N1 and N2. N2 got an awesome cooling system. Overall imo the N2 is one of the best SBC.
I wonder if TSMC having 12/16nm faulty photoresist material problems will slow or delay SoC supply
TSMC takes $550m hit on defective photoresist material.
” TSMC said it discovered that a batch of photoresist from a chemical supplier contained a specific component which was abnormally treated, creating a foreign polymer in the photoresist that affected 12/16nm wafers at its Fab 14B.
To ensure the quality of wafers delivered to customers, TSMC said they have decided to scrap a higher number of wafers than its earlier estimate. ” ref electronicsweekly
As far as i know, the previous amlogic SoCs (S905, etc…) used Global Foundries 28nm process
So it could be that these new SoCs use Global Foundries 12nm instead of TSMC one (to be confirmed)
> As far as i know, the previous amlogic SoCs (S905, etc…) used Global Foundries 28nm process
At least (in 2014):
“Our partnership with TSMC’s 28HPC technology has further extended our leadership in 4K Ultra High Definition OTT STB, Smart TV, and tablet SoC solutions,” said John Zhong, CEO of Amlogic.
https://www.prnewswire.com/news-releases/tsmc-28hpc-process-in-volume-production-274859841.html
i was wrong
i shouldn’t have been lazy : i thought 28nm HKMG referred to a GF process code name, but it’s not.
I thought GlobalFoundries had stopped at 14nm, canceling their 7nm?
Tried to summarize what we already know and what’s still missing: https://github.com/ThomasKaiser/Knowledge/blob/master/articles/Quick_Preview_of_ODROID-N2.md
Now with the first detailed sbc-bench report available I’m starting to believe that S922X is made in a more efficient process than RK3399. Also reported clockspeeds for both big and little cores seem to be reasonable at least with single-threaded loads. With multi-threaded scores I’m still not 100% convinced (needs rather simple tests requiring maybe 2 minutes more time as outlined below)
https://github.com/ThomasKaiser/Knowledge/blob/master/articles/Quick_Preview_of_ODROID-N2.md#sbc-bench-results
So even if it’s still early, it looks like HK has done quite a great job on this board, this is very encouraging!
Absolutely!
I guess for the majority of use cases the N2 performs very well even if not showing higher single-threaded performance compared to N1/RK3399. Lack of PCIe (or PCIe attached SATA) might be an issue for some users but hey… other SBC exist too (and HK could easily design a HC3/HC4 based on S922X which would allow to position those HC devices not just as NAS but also able to be used for virtualization/containerization with 64-bit and 4 GB RAM).
While I’m able to accept now that S922X really is made in a more advanced process I still don’t understand the limitations, e.g. A73 cores being limited to 1.8 GHz. And also what happens if users try the ‘overclocking’ settings. When @rooted tried sbc-bench with both clusters set to 2.0 GHz cpuminer failing almost immediately was one of the expected results but not why memory bandwidth is then lower compared to stock 1.8/1.9 GHz settings: http://ix.io/1BrG
Anyway: this board with its solid powering method, great passive cooling setup, sufficient performance and great software support is a good basis for a lot of use cases. I hope Hardkernel explores/provides an automated StabilityTester approach to assist their users stopping silly ‘overclocking’ attempts. 7%-8% more ‘performance’ is nothing anyone will be able to notice while the price you pay is instabilities, crashes and data corruption.
It is to do with the technology design of the 12nm nodelet process used and packaging technology.
> this is very encouraging!
Already looking forward to your build farm tests once N2 is available at ODROID bench. My expectation is 20%-30% faster compared to your NEO4’s.
I expect to test once available as well 🙂
They put four N2 on the bench (obviously with some airflow blowing over the heatsinks so throttling will not occur in any case).
Just checked sbc-bench execution inside one of the containers (not disturbed by other visitors) and results are almost the same as running bare metal: http://ix.io/1BEM
How does one get single-core numbers from p7zip bench?
ed: nevermind, I saw it.
For those happy to look at graphs visualizing random numbers generated by questionable kitchen-sink benchmarks… here’s ODROID-N1 vs. ODROID-N2 (since ODROID-N1 does not exist in the wild this is simply representative for all the other RK3399 boards around):
https://openbenchmarking.org/result/1902228-SP-1902215SP13
Some tests failed on N1, some on N2 — neither know why nor care since the Phoronix Test Suite (PTS) is such a mess anyway.
When monitoring benchmark execution I observed the following:
* on the N2 when running in a Docker on “ODROID bench” the PTS thought the OS would be a Debian 9.7 while it’s Ubuntu Bionic instead (important since default GCC versions are 6.3 vs. 7.3 which makes a huge difference with a lot of benchmarks)
* the perl interpreter test when running on N2 was sent to a little core most of the time (maybe S922X BSP kernel scheduler needs some tweaks)
* the PostgreSQL tests constantly access the underlying storage. Result variation will occur depending on type of storage (slow SD card vs. fast eMMC vs. super fast SSD for example). On a really fast SanDisk Extreme A1 I’ve seen %iowait values of up to 40% with N1. That’s just a random I/O storage benchmark not suitable to tell anything about ‘CPU performance’ of the tested ARM device itself (see RPi 3B vs. 3B+ for example)
Asides that it should be noted that the PTS uses whatever compiler version it finds on the system, uses different compiler flags for different tests (sometimes sets own flags, sometimes relying on defaults) so this is mostly a compiler benchmark only partially telling something about hardware performance.
Are there numbers how gcc 6.3 and 7.3 compare in compiling efficiency, while having (nearly) comparable settings for a defined compilation situation (only gcc version differs on equal platform/demand/date)?
We wonder why so few tools are available for counting the ‘lines of code’ actually compiled by a single ‘make’ invocation or build command in general. Or, better said, there are none, AFAIK?
Thx for this update.
> Are there numbers how gcc 6.3 and 7.3 compare in compiling efficiency
Sure, for example search for ix.io/1iFm in https://github.com/ThomasKaiser/sbc-bench/blob/master/Results.md where you find three runs on exactly the same OS image and hardware, but I built cpuminer once with the distro’s default GCC 6.3, then (the line below) with 7.3 and then (below that) with 8.2:
* GCC 6.3: 3.85 khash/s
* GCC 7.3: 4.40 khash/s
* GCC 8.2: 4.63 khash/s
You get an application performance boost of 20% for free just by updating your compiler. But this depends on the application in question, and there are other benchmarks and tasks that are affected much less or not at all.
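For reference, the relative gains follow directly from the three cpuminer scores listed above. The short sketch below only redoes that arithmetic with GCC 6.3 as the baseline; the variable names are made up for illustration.

```python
# cpuminer scores from the list above, in khash/s.
khash_per_s = {"GCC 6.3": 3.85, "GCC 7.3": 4.40, "GCC 8.2": 4.63}

baseline = khash_per_s["GCC 6.3"]

# Gain over the GCC 6.3 build, in percent, rounded to one decimal.
gain_pct = {
    compiler: round((score / baseline - 1) * 100, 1)
    for compiler, score in khash_per_s.items()
}

for compiler, gain in gain_pct.items():
    print(f"{compiler}: {gain:+.1f}%")
```

This gives about 14% for GCC 7.3 and about 20% for GCC 8.2, for this one workload only.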
But in general comparing one ‘hardware’ running e.g. Debian Jessie (GCC 4.7 or 4.9) with another ‘hardware’ running e.g. Ubuntu Bionic (GCC 7.3) will show better performance for the latter just due to newer compiler version (same happens even on exactly the same hardware). The PTS doesn’t take care of this at all.
While it would be possible to test through with the PTS I lack the time for this. For whatever reasons the PTS takes ages to run benchmarks (building the benchmarks most of the times happens only on a single CPU core — it’s a mess). Each of the above benchmarks took more than 4.5 hours for a single run!
Thx. Means compile time for kernels with different gcc versions could differ up to 20%. (Fastest compiling on newest OSes is ‘only’ 20% faster compared to (possibly) updated OSes on older hardware?) Some recommend gcc option -O1 (avoids reordering memory accesses) for general improvement towards shorter compile times.
What are your thoughts on lmbench (lmbench – Utilities to benchmark UNIX systems)?
Available on Ubuntu 18.04 (Bionic), but not on Debian 9.x (Stretch) repositories, AFAIK.
These are not compile times. What @tkaiser posted was the performance of the compiled binaries.
Word of advice: don’t compile with -O1 — plenty of code today is written with the assumption that the compiler will optimize code written for best readability to something that executes efficiently, and -O1 is not good for that. Stick with -O3, for robust fp code even -Ofast. If something breaks — drop to -O2. But -O1 is really useless outside of debugging. Of course, where size is essential, use -Os.
Thx,
mail to gcc developers:
Hello,
we would be interested, if there is a table comparing different gcc compiler versions, including varied mainly used options, considering their compile time for a given source code or loc (lines of code), efficiency improvement of compiling and performance (loc/s).
Is there an option that summarizes loc that are actually compiled during one compilation event started in a Makefile for example?
How have resulting binaries been evolving in performance over different gcc versions and over this ~18years, since dev of gcc started 2001?
Could you recommend a tool for Linux OSes for counting (lines of code actually built)/second in one compile run, instead of counting lines of code in sources (e.g. cloc)?
Seems email address was wrong.
gcc-help@gcc.gnu.org
Maybe someone here could forward this request, pls?
Phoronix: GCC 7.4.0 GCC 8.2.0 GCC 9.0.0
https://openbenchmarking.org/result/1812284-SP-GCCCOMPIL26
(varying improvements, e.g. 7zip, cons. very early version of gcc 9.0.0, AFAIK)
> Means compile time for kernels with different gcc versions could differ up to 20%
As @blu already explained it’s not about compile times but about performance of the binaries created. Another example how/why settings matter: There’s a thread in Armbian forum dealing with performance comparisons between RK3399 and S922X and something almost everybody missed so far is that default kernel config from Rockchip and Amlogic differs wrt CONFIG_HZ.
By switching from CONFIG_HZ=1000 (Rockchip default) to CONFIG_HZ=250 (rest of the world) and from GCC 7.3 to 8.2 the render performance of a program called ‘Blender’ improves by almost 25% on RK3399: https://forum.armbian.com/topic/9619-announcement-odroid-n2/?do=findComment&comment=73089 (curious how numbers differ with a GCC upgrade on S922X with the A73 cores; this would require one of those Hardkernel betatesters to do a quick do-release-upgrade)
Interesting
Another Phoronix benchmark on gcc 7.3 and 8.1
kernel 4.13 slightly slower, mp3 faster (overall hardware related differences)
https://www.phoronix.com/scan.php?page=article&item=gcc-81-benchmarks&num=4
S922X would need around half an hour for compiling its own kernel.
( difference ~4-5%, from above non arm related hardware benchmark, but available comparison )
Just for the record I’ve just tested the build time with my usual test method (http://wiki.ant-computing.com/Choosing_a_processor_for_a_build_farm). The device performs extremely well. The 4 A73 cores at 1.8 GHz are as fast as the 4 A72 cores at 2.0 GHz in the MacchiatoBin which has twice the memory bandwidth. The board is 33% faster than the NanoPI-M4 at stock frequency, and 21% faster than the NanoPI-M4 overclocked at 2.0+1.8. So I must say it performs *very* well. It’s likely that I’ll buy a few in April to replace the aging MiQis at the office 😉
> The 4 A73 cores at 1.8 GHz as fast as the 4 A72 cores at 2.0 GHz
I suspect these are the perks of not doing fp ; )
I think instead that it’s the two LD/ST AGU units, compared to one LD + one ST in the A72, that make the difference. Build time is extremely sensitive to memory latency, and while the A72 in the mcbin has almost twice the bandwidth, it’s very likely that the A73 achieves a lower overall latency under high contention. Also one point that I tend to forget is that they significantly shrunk the pipeline (11 vs 15 stages). This does have an effect on code which spends its time iterating over lists and trees and mispredicts a lot. So in the end it could very well be that the A73 is faster on this type of workload.
Yes, indeed. But that compounds to the non-fp nature of the workload, I think — the integer portions of the pipeline being streamlined, vs the workload not doing much (any?) fp. CA72 should be faster per-clock in (simd) fp, so there the chances of CA73 catching up are much slimmer.
ODROID-N2 is available now.
https://www.hardkernel.com/shop/odroid-n2-with-4gbyte-ram/
Cool, thanks for the update!
A first look at Odroid N2 running Android 9.0: video on YouTube, published by ETA PRIME and dated 3 April 2019. It shows the Android 9.0 release has problems, with flickering and tearing when running games. Still needs work.
Just dropping by to confirm @willy’s observations: odroid-N2 is the dream integer-spaghetti-code machine.
We have a new arm per-clock leader in the brainfuck charts (^f Amlogic):
https://github.com/blu/brainstorm
And a nearly-as-good-as-CA72 at NEON/gemm performer (^f Amlogic):
https://github.com/blu/gemm
Rarely have I seen such a good machine for $80.