The state of arm64
				  Wookey, wookey@wookware.org
				
  Who is Wookey?
  
    - Debian Developer, ARM porter
      Embedded/multiarch/crossbuild/bootstrap 
    - Linaro, seconded from ARM
 
  
  Ramble through state
  
    - Supposed to be BOF-y
 
    - Built, working, optimised, missing
 
    - Hardware availability
 
    - Benchmarking, what next
 
  
  I'm a build engineer
  If it builds, it's done.
  Mostly Done
  
    - Things people care about work
 
    - Some things still need porting
 
  
  But how well do they work?
  
    - Some things may not work at all...
 
    - Popular things are optimised
 
    - Many things are not
 
    - Need feedback from real users
 
  
  Some History
  
    - Started 2010 - toolchains
 
    - 'Fast' Model 2011
 
    - Qemu 2013
 
    - Hardware March 2014
 
  
  5 bootstraps
  
    - Crossbuilt internal bootstrap 2011
 
    - Crossbuilt ubuntu bootstrap 2012
 
    - Ubuntu rebootstrap 2013/14
 
    - Debian ports rebootstrap 2014
 
    - Debian official rebootstrap 2014/15
 
  
  Now
  
    - 96% built (11205 packages)
 
    - 314 failed
 
    - 370 not for us
 
  
  Languages
  
    
    - Optimised
 
    
      - C, C++, Java (8&9), Python, Perl, Ocaml, Javascript (v8), Haskell (ghc), Lisp (SBCL) (2 months)
 
    
    - Ported
 
    
      - Lua, R, Rust, Golang, Julia (3 days), Perl6 (3 months), Javascript, Pascal (fpc) (just now!)
 
    
    - Missing
 
    
      - Mono, libphobos (D), Luajit, ?
 
    
  
  Still missing
  
    - Mono (23)
 
    - libphobos (D-libraries) (11)
 
  
  Mono
  
    - Blocks 23 packages
 
    - Done but not released
 
    - Does anyone care enough?
 
    - Coreclr instead? (not in Debian yet)
 
  
  Free Pascal (fpc)
  
    - Blocks 22 packages
 
    - Port Done
 
    - Uploaded yesterday :-)
 
  
  Not missing
  
    - Nodejs - libv8 
 
    - Julia - uploaded 3 days ago
 
    - golang - (base support in v1.5)
 
    - Lua (but no luajit)
 
    - Ocaml (native in 4.02, 2015.07)
 
    - SBCL (2015.10)
 
    - Perl6 (2015.10)
 
  
  Optimisation
  
    - C fallback
 
    - Assembler, Intrinsics, Neon
 
  
  Assembler in packages
  
    1000 bits of assembler:
    - https://wiki.linaro.org/Platform/DevPlatform/ArmSoftwareList
 
    - http://performance.linaro.org/
 
  
  
    Removing is often better than 'fixing' (e.g. alsalib)
  
  What's Optimised
  
    - GCC (compilers)
 
    - LLVM (no linker) (compiler)
 
    - Java 8 & 9
 
    - openSSL
 
    - libV8 (Javascript jit)
 
    - fftw3 (FFT library) Neon support
 
    - gnu-mp (gmp) (Feb 2013)
 
    - hadoop (crc, using HWCAP)
 
    - ceph (crc, using HWCAP)
 
    - Kernel (raid6, crypto)
 
    - Xen (loads less code on arm64)
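
  The "crc, using HWCAP" entries refer to checking CPU features at run time and only then taking the optimised path. A minimal sketch of that idea, assuming Linux on arm64 and glibc's getauxval(); the helper names (decode_hwcap, read_hwcap, pick_crc_path) are made up for illustration and are not taken from hadoop or ceph:

```python
import ctypes
import platform

AT_HWCAP = 16  # auxiliary-vector key, as defined in <elf.h>

# A few arm64 HWCAP bits as defined by the Linux kernel (asm/hwcap.h).
# Only a subset is listed here.
ARM64_HWCAP_BITS = {
    "fp": 1 << 0,
    "asimd": 1 << 1,  # Advanced SIMD, i.e. Neon
    "aes": 1 << 3,
    "pmull": 1 << 4,
    "sha1": 1 << 5,
    "sha2": 1 << 6,
    "crc32": 1 << 7,
}


def decode_hwcap(hwcap):
    """Return the set of known feature names whose bit is set in hwcap."""
    return {name for name, bit in ARM64_HWCAP_BITS.items() if hwcap & bit}


def read_hwcap():
    """Return AT_HWCAP from getauxval(3), or 0 when not on Linux/arm64."""
    if platform.system() != "Linux" or platform.machine() != "aarch64":
        return 0
    try:
        getauxval = ctypes.CDLL(None).getauxval
    except (OSError, AttributeError):
        return 0
    getauxval.restype = ctypes.c_ulong
    getauxval.argtypes = [ctypes.c_ulong]
    return getauxval(AT_HWCAP)


def pick_crc_path(hwcap):
    """Choose the CRC instructions if the CPU has them, else the C fallback."""
    return "crc32-instructions" if "crc32" in decode_hwcap(hwcap) else "c-fallback"


if __name__ == "__main__":
    hwcap = read_hwcap()
    print(sorted(decode_hwcap(hwcap)), pick_crc_path(hwcap))
```

  On anything that is not Linux/arm64 the sketch reports no features and falls back, which is the same "C fallback" behaviour a portable package needs.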
  
 
  Not Optimised
  
    - Ionmonkey (Mozilla JIT)
 
    - Golang (being worked on)
 
    - OpenCV
 
    - ffmpeg/libav (parts)
 
    - Openoffice
 
    - Lua (luajit)
 
    - R?
 
    - What else?
 
  
  Blocking Packages
  
    - mono (23), fpc (22), libphobos (11)
 
    - dietlibc (5)
 
    - insighttoolkit (3)
 
    - timblserver (2)
 
    - umview (3)
 
    - kexec-tools (2)
 
  
  Never Built
    ('Auto Not for us')
  
  
  - For other arches
 
  
    - nvidia-support
 
    - nvramtool
 
    - powerpc-utils
 
    - raxml
 
  
  - Probably should be fixed
 
  
    - openafs
 
    - rootstrap
 
    - scm
 
    - qtemu
 
    - darktable
 
    - lots more
 
  
  
   
  I have no hardware!
    Online
  
    - OVH (Runabove.com) ThunderX
 
    - OBS (Open Build System)
 
    - Debian porter boxes
 
  
  I have no hardware!
    Hardware
  
    - HP Moonshot
 
    - ARM juno
 
    - APM C1
 
    - Softiron 3000
 
    - Gigabyte MP30
 
    - Hikey (96boards)
 
    - Dragonboard 410c (96boards)
 
    - Anaconda
 
    - Pine64
 
  
  Applied Micro X-gene
    $1500

  Juno (ARM A57)
    $6000

  Moonshot (HP)
    $10000+

  Softiron 3000 (AMD Seattle SoC)
    $2500

  Gigabyte MP30 (X-gene SoC)
    €950

  Hikey (96boards, Hisilicon/Lemaker)
    $75 (1GB), $99 (2GB)

  Dragonboard 410c (96boards, Qualcomm)
    $75

  Pine64 (Allwinner A64)
    May 2016: $15 (0.5GB) - $29 (2GB)

  Benchmarking
  Benchmarking across architectures is difficult.
  What are good tests?
 
  
  Sometimes it's obvious (botch)
  
    | Arch                  | Build time |
    | amd64                 | 37m        |
    | arm64 (generic ocaml) | 4hrs 52m   |
    | arm64 (native ocaml)  | 1hr 15m    |
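
  To put numbers on "obvious", the build times in the table can be turned into ratios. A small sketch; the figures are the ones from the table, while the helper names (to_minutes, ratio) are just for illustration:

```python
import re


def to_minutes(text):
    """Convert strings like '4hrs 52m', '1hr 15' or '37m' to whole minutes."""
    match = re.fullmatch(r"\s*(?:(\d+)\s*hrs?)?\s*(?:(\d+)\s*m?)?\s*", text)
    if not match or not any(match.groups()):
        raise ValueError(f"unrecognised duration: {text!r}")
    hours, minutes = (int(group) if group else 0 for group in match.groups())
    return hours * 60 + minutes


# Build times copied from the table.
BUILD_TIMES = {
    "amd64": "37m",
    "arm64 (generic ocaml)": "4hrs 52m",
    "arm64 (native ocaml)": "1hr 15m",
}


def ratio(slower, faster):
    """How many times longer the `slower` build took than the `faster` one."""
    return to_minutes(BUILD_TIMES[slower]) / to_minutes(BUILD_TIMES[faster])


if __name__ == "__main__":
    print(round(ratio("arm64 (generic ocaml)", "amd64"), 1))
    print(round(ratio("arm64 (native ocaml)", "amd64"), 1))
    print(round(ratio("arm64 (generic ocaml)", "arm64 (native ocaml)"), 1))
```

  With these figures the generic-ocaml arm64 build takes roughly 7.9 times as long as amd64, the native-ocaml build roughly 2.0 times, so native code generation alone is worth about a factor of 3.9. The machines are not equivalent, so this says more about the ocaml port than about the hardware.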
  
   
  Benchmarking
  Find good tests?
  Broadly equivalent platforms
  Look for changes over time