What's broken on ARM64?
				  Wookey
				  wookey@wookware.org
				
  Who is Wookey?
  
    - Debian Developer, ARM porter
      Embedded/multiarch/crossbuild/bootstrap 
    - Linaro, seconded from ARM
 
  
  Ramble through state
  
    - Supposed to be BOF-y
 
    - Built, working, optimised, missing
 
    - Hardware availability
 
    - Benchmarking, what next
 
  
  I'm a build engineer
  If it builds, it's done.
  Mostly Done
  
    - Things people care about work
 
    - Some things still need porting
 
  
  But how well do they work?
  
    - Some things may not work at all...
 
    - Popular things are optimised
 
    - Many things are not
 
    - Need feedback from real users
 
  
  Some History
  
    - Started 2010 - toolchains
 
    - 'Fast' Model 2011
 
    - Qemu 2013
 
    - Hardware March 2014
 
  
  5 bootstraps
  
    - Crossbuilt internal bootstrap 2011
 
    - Crossbuilt Ubuntu bootstrap 2012
 
    - Ubuntu rebootstrap 2013/14
 
    - Debian ports rebootstrap 2014
 
    - Debian official rebootstrap 2014/15
 
  
  Now
  
    - 96% built (11331 packages)
 
    - 303 failed
 
    - 377 not for us
 
  
  Languages
  
    
    - Optimised
 
    
      - C, C++, Java (8&9), Python, Perl, OCaml, JavaScript (v8), Haskell (ghc), Lisp (SBCL) (2 months)
 
    
    - Ported
 
    
      - Lua, R, Rust, Golang, Julia (2 months), Perl6 (4 months), JavaScript, Pascal (fpc) (at FOSDEM)
 
    
    - Missing
 
    
      - Mono, libphobos (D), Luajit, ?
 
    
  
  Still missing
  
    - Mono (23)
 
    - libphobos (D-libraries) (11)
 
  
  Mono/C#
  
    - Blocks 23 packages
 
    - Done but not released
 
    - Does anyone care enough?
 
    - Coreclr instead? (not in Debian yet)
 
  
  Not missing
  
    - Nodejs - libv8 
 
    - golang - (base support in v1.5)
 
    - Lua (but no luajit)
 
    - OCaml (native in 4.02, 2015.07)
 
    - SBCL (2015.10)
 
    - Perl6 (2015.10)
 
    - Julia (2016.01)
 
    - fpc (2016.02)
 
  
  Optimisation
  
    - C fallback
 
    - Assembler, Intrinsics, Neon
 
  
  Assembler in packages
  
    1000 bits of assembler:
    - https://wiki.linaro.org/Platform/DevPlatform/ArmSoftwareList
 
    - http://performance.linaro.org/
 
  
  
    Removing is often better than 'fixing' (e.g. alsalib)
  
  What's Optimised
  
    - GCC (compilers)
 
    - LLVM (no linker) (compiler)
 
    - Java 8 & 9
 
    - OpenSSL
 
    - libV8 (JavaScript JIT)
 
    - fftw3 (FFT library) Neon support
 
    - GNU MP (gmp) (Feb 2013)
 
    - hadoop (crc, using HWCAP)
 
    - ceph (crc, using HWCAP)
 
    - Kernel (raid6, crypto)
 
    - Xen (loads less code on arm64)
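
The "crc, using HWCAP" entries refer to runtime feature detection: the program asks the kernel which CPU extensions are present and only then selects the accelerated CRC path, otherwise staying on the C fallback. A minimal sketch of that check, assuming arm64 Linux. The constants come from the Linux headers (`AT_HWCAP` in `<elf.h>`, `HWCAP_CRC32` in the arm64 `<asm/hwcap.h>`); the function names are illustrative only and are not taken from hadoop or ceph.

```python
import ctypes
import platform

AT_HWCAP = 16          # auxiliary-vector key, from <elf.h>
HWCAP_CRC32 = 1 << 7   # arm64 CRC32 extension bit, from <asm/hwcap.h>


def read_hwcap() -> int:
    """Return the AT_HWCAP bitmask on arm64 Linux, or 0 anywhere else."""
    if platform.system() != "Linux" or platform.machine() != "aarch64":
        return 0
    libc = ctypes.CDLL(None)                    # symbols of the running process
    libc.getauxval.restype = ctypes.c_ulong
    libc.getauxval.argtypes = [ctypes.c_ulong]
    return int(libc.getauxval(AT_HWCAP))


def has_crc32(hwcap: int) -> bool:
    """True if the CRC32 instructions are advertised in the bitmask."""
    return bool(hwcap & HWCAP_CRC32)


def pick_crc_impl(hwcap: int) -> str:
    """Hypothetical dispatcher: name the code path a library would select."""
    return "hardware-crc32" if has_crc32(hwcap) else "portable-c-fallback"


if __name__ == "__main__":
    print(pick_crc_impl(read_hwcap()))
```

Because the bit is tested at run time, one binary package can serve both cores with the CRC extension and cores without it.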
 
  
  Not Optimised
  
    - IonMonkey (Mozilla JIT)
 
    - Golang (being worked on)
 
    - OpenCV
 
    - ffmpeg/libav (parts)
 
    - OpenOffice
 
    - Lua (luajit)
 
    - R?
 
    - ATLAS?
 
    - What else?
 
  
  Blocking Packages (Debian)
  
    - mono (23), libphobos (11)
 
    - openais (6)
 
    - pocl (6)
 
    - dietlibc (5)
 
    - ffcall (5)
 
    - insighttoolkit (3)
 
    - timblserver (2)
 
    - umview (3)
 
    - kexec-tools (2)
 
  
  
  https://people.debian.org/~wookey/bootstrap/blocked-deps-list
  Never Built
    ('Auto Not for us')
  
  
  - For other arches
 
  
    - nvidia-support
 
    - nvramtool
 
    - powerpc-utils
 
    - raxml
 
  
  - Probably should be fixed
 
  
    - openafs
 
    - scm
 
    - qtemu
 
    - darktable
 
    - lots more
 
  
  
   
  https://buildd.debian.org/status/architecture.php?a=arm64
  I have no hardware!
    Online
  
    - OVH (Runabove.com) ThunderX
 
    - OBS (Open Build System)
 
    - Linaro build farm
 
    - Debian porter boxes
 
  
  I have no hardware!
    Hardware
  
    - HP Moonshot
 
    - ARM juno
 
    - APM C1
 
    - Softiron 3000
 
    - Gigabyte MP30
 
    - Hikey (96boards)
 
    - Dragonboard 410c (96boards)
 
    - Cello/Husky
 
    - Pine64
 
  
  Applied Micro X-gene
  
  $1500
  Juno (ARM A57)
  
  $6000
  Moonshot (HP)
  
  $10000+
  Softiron 3000 (AMD A1100)
  
  $2500
  Gigabyte MP30 (X-gene SOC)
  
  €950
  Hikey (96boards, Hisilicon/Lemaker)
  
   $75 (1G), $99 (2G)
  Dragonboard 410c (96boards, Qualcomm)
  
  
  $75
  Cello (AMD A1100)
  
  April 2016?  $300
  Pine64 (Allwinner A64)
  
  May 2016?  $15 (0.5G) - $29 (2G)
  Benchmarking
  Benchmarking across architectures is difficult.
  What are good tests?
 
  
  Sometimes it's obvious (botch)
  
    | Arch                  | Build time |
    | amd64                 | 37m        |
    | arm64 (generic ocaml) | 4hrs 52m   |
    | arm64 (native ocaml)  | 1hr 15m    |
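
To put numbers on the gap, the build times above can be turned into slowdown ratios against amd64. A small sketch; the helper names are made up for illustration, and "1hr 15" is read as 1 hour 15 minutes.

```python
import re


def to_minutes(text: str) -> int:
    """Convert strings such as '4hrs 52m' or '37m' to whole minutes."""
    hours = re.search(r"(\d+)\s*hrs?", text)
    minutes = re.search(r"(\d+)\s*m\b", text)
    total = int(hours.group(1)) * 60 if hours else 0
    total += int(minutes.group(1)) if minutes else 0
    return total


# Build times as quoted in the table.
build_times = {
    "amd64": "37m",
    "arm64 (generic ocaml)": "4hrs 52m",
    "arm64 (native ocaml)": "1hr 15m",
}


def slowdown(times: dict, baseline: str) -> dict:
    """Ratio of each build time to the baseline architecture's time."""
    base = to_minutes(times[baseline])
    return {name: to_minutes(value) / base for name, value in times.items()}


if __name__ == "__main__":
    for name, ratio in slowdown(build_times, "amd64").items():
        print(f"{name}: {ratio:.2f}x")
```

On these figures the generic build is roughly 7.9 times slower than amd64 and the native build roughly 2 times slower, so the native OCaml port alone accounts for nearly a fourfold speed-up.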
  
   
  Benchmarking
  Find good tests?
  Broadly equivalent platforms
  Look for changes over time