I had some issues getting Hashcat running on my local machine and started digging into GPU acceleration. My main goal was to run some cryptanalysis on offline hashes, but I had previously been confined to doing this in a virtual machine running ParrotOS. I don’t mind Parrot, as pretty much everything works out of the box and it ships drivers for many cheap WiFi chipsets. However, performance suffers greatly in a VM (how much depends on your VM configuration), and I wanted to use GPU acceleration anyway.
The project I found was Radeon Open Compute (ROCm), which is roughly AMD’s counterpart to CUDA for cards that run on the newer AMDGPU Linux driver. I have been consistently impressed with AMDGPU’s performance on Linux, so I kept buying Radeon graphics cards even while the GTX 1080 Ti and GTX 1660 were being released. The card I am using is an RX 580, but I hope to upgrade to Navi or Radeon VII when those become more available. After installing ROCm on my machine, I had many issues getting Hashcat to recognize it as an OpenCL environment. Hashcat would either stall or segfault, with little indication of what I was doing wrong.
I’ll walk you through what I did to get things working, and try to show you the pitfalls to avoid.
I’m using Gentoo Linux, but you can use Ubuntu, Fedora, or whatever you want. ROCm needs some binaries installed (from their GitHub or your package manager) as well as a custom Linux kernel that includes some yet-to-be-merged patches for the AMDGPU driver.
I don’t think you need all of these packages, but here’s what I have installed:
$ eix -cI "(amdgpu|rocm|hashcat)"
[I] app-crypt/hashcat-utils (1.9@06/09/2019): a set of small utilities that are useful in advanced password cracking
[I] dev-libs/rocm-cmake [1] (9999@06/10/2019): ROCm-CMake
[I] dev-libs/rocm-opencl-driver [1] (2.2.0@06/11/2019): ROCm-OpenCL-Driver
[I] dev-libs/rocm-opencl-runtime [1] (2.5.0@06/10/2019): ROCm-OpenCL-Runtime
[I] dev-util/rocm-smi [1] (2.5.0@06/16/2019 -> (~)9999): ROCm System Management Interface
[I] dev-util/rocminfo [1] (9999@06/12/2019): ROCm Application for Reporting System Info
[I] sys-devel/amd-rocm-meta [1] (2.5.0(2.4)@06/11/2019): Meta package for ROCm
[I] sys-kernel/rocm-sources [1] (2.5.9999(2.5)@06/11/2019): ROCm kernel sources
[I] x11-drivers/xf86-video-amdgpu (19.0.1@07/17/2019): Accelerated Open Source driver for AMDGPU cards
You could install rocm-smi, rocminfo, and rocm-sources from GitHub, but if you also use Gentoo (or Funtoo), justxi’s portage overlay was plenty up-to-date for me.
I am using hashcat-utils for some of the useful binaries it installs, but I am not using my repository’s hashcat package (more on this later).
$ equery f hashcat-utils
* Searching for hashcat-utils ...
* Contents of app-crypt/hashcat-utils-1.9:
/usr
/usr/bin
/usr/bin/cap2hccapx
/usr/bin/cleanup-rules
/usr/bin/combinator
/usr/bin/combinator3
/usr/bin/combipow
/usr/bin/ct3_to_ntlm
/usr/bin/cutb
/usr/bin/expander
/usr/bin/gate
/usr/bin/generate-rules
/usr/bin/hcstat2gen
/usr/bin/hcstatgen
/usr/bin/keyspace
/usr/bin/len
/usr/bin/mli2
/usr/bin/morph
/usr/bin/permute
/usr/bin/permute_exist
/usr/bin/prepare
/usr/bin/req-exclude
/usr/bin/req-include
/usr/bin/rli
/usr/bin/rli2
/usr/bin/rules_optimize
/usr/bin/splitlen
/usr/bin/strip-bsn
/usr/bin/strip-bsr
If you’ve never set up a Linux kernel before, I wouldn’t worry too much, as this is going to be pretty straightforward. However, if you have anything unusual in your kernel (RAID, special drivers you need to boot your OS, etc.), this method may need some tweaking. In general, we’re going to use the ROCm team’s base kernel config, compile a kernel, copy it to our boot partition, and update grub. After booting into the new kernel, Hashcat will see ROCm’s amdgpu-kfd OpenCL implementation and use the kernel driver to access our GPU resources.
As always, make sure you have a backup of anything important. There is potential to mess up your machine and have to boot into a recovery environment to restore your boot partition.
Obtain rocm-sources either from GitHub or (with the Gentoo overlay) from your package manager. Navigate to the sources directory (the one containing the Makefile) and run as root:
# make rock-rel_defconfig
# make nconfig && make -sj9 && make modules_install && make install
This will bring you to an ncurses kernel config menu where you can make any kernel tweaks. You can then save and exit with F9, and it will start compiling.
You can tweak the make -j option to be your number of logical CPUs + 1 to speed up compilation.
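If you don’t want to hardcode the job count, here is a minimal sketch that derives it from the machine. The make_jobs helper is my own name, not part of the kernel build system, and it assumes either nproc (coreutils) or getconf is available.

```shell
# Print a value for make -j: logical CPU count plus one.
# The CPU count can be passed as an argument; otherwise it is detected.
make_jobs() {
    cpus="${1:-$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)}"
    echo $(( cpus + 1 ))
}

# Only print the resulting command; run it yourself from the kernel source tree.
echo "make -sj$(make_jobs)"
```

On a machine with 8 logical CPUs this prints make -sj9, which matches the value used in the command above.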
After it’s done compiling your kernel, it will install modules to /lib/modules and build the initramfs.
Lastly, it will copy your new Linux kernel image and initramfs to /boot.
Next, you need to rebuild the grub boot menu:
# cd /boot
# ls
# cd grub
# grub-mkconfig -o grub.cfg
I recommend running the commands shown so you can see which images are on your boot partition before you create the grub file.
I like to form an expectation of what it will create, then make sure it adds what I expected to the grub config.
You don’t want any surprises that lock you out of your OS.
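One way to do that check without reading the whole file is to search the generated config for the kernel version you expect. This is only a sketch: grub_has_kernel is a hypothetical helper name, and the path and version string in the usage note are the ones from my machine, so substitute your own.

```shell
# Succeed if the given grub config file mentions the given kernel version string.
# Both values are supplied by the caller; the match is a plain fixed-string search.
grub_has_kernel() {
    cfg="$1"
    version="$2"
    grep -qF -- "$version" "$cfg"
}
```

For example, grub_has_kernel /boot/grub/grub.cfg 5.0.0-rc1-kfd+ && echo "entry found" should print the message once the new kernel has been picked up.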
One caveat here: depending on your Linux distribution (Ubuntu, Fedora, Gentoo, etc.), your /boot partition layout will be different, and you may even be booting directly without grub, in which case you probably know what to do.
The main things to look out for are whether it’s /boot/grub or /boot/grub2, and whether it’s grub-mkconfig or grub2-mkconfig.
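If you’re not sure which variant your distribution uses, a small sketch like the following can tell you. Both function names are hypothetical helpers of mine, and they only look for the two directory and command names mentioned above.

```shell
# Print the first grub config generator found on PATH, or fail if neither exists.
find_mkconfig() {
    for tool in grub-mkconfig grub2-mkconfig; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            return 0
        fi
    done
    return 1
}

# Print the first existing grub directory under the given boot root (default /boot).
find_grub_dir() {
    boot="${1:-/boot}"
    for dir in "$boot/grub" "$boot/grub2"; do
        if [ -d "$dir" ]; then
            echo "$dir"
            return 0
        fi
    done
    return 1
}
```

With these, something like "$(find_mkconfig)" -o "$(find_grub_dir)/grub.cfg" would be the equivalent of the command I ran, but check the two values by hand before writing anything to your boot partition.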
Once you reboot, press the arrow keys at the grub menu to get to the advanced options.
Boot into the newly created ROCm kernel, which for me was Gentoo GNU/Linux, with Linux x86_64-5.0.0-rc1-kfd+.
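After logging in, it’s worth confirming you actually booted the right kernel before blaming Hashcat. The sketch below assumes your ROCm kernel’s release string contains "kfd" like mine does (yours may be named differently), and it checks for /dev/kfd, the device node the compute driver exposes. The is_kfd_kernel helper is a name I made up for this example.

```shell
# Succeed if a kernel release string contains "kfd".
# Assumption: the ROCm kernel is named like mine (5.0.0-rc1-kfd+).
is_kfd_kernel() {
    case "$1" in
        *kfd*) return 0 ;;
        *) return 1 ;;
    esac
}

release="$(uname -r)"
if is_kfd_kernel "$release"; then
    echo "Running ROCm kernel: $release"
else
    echo "Not a kfd kernel: $release"
fi

# The compute driver exposes this device node when it is loaded.
if [ -e /dev/kfd ]; then
    echo "/dev/kfd present"
else
    echo "/dev/kfd missing"
fi
```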
I would first try your package manager’s Hashcat, but I had issues with hashcat-5.1.0 only seeing Mesa OpenCL and segfaulting even when I ran it with --force.
If that doesn’t work, or you want to be on the latest version, you can get Hashcat from GitHub, build it, and (optionally) install it.
I recommend putting it in /opt/hashcat and symlinking the built binary to /usr/local/bin/hashcat.
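Roughly, the build looks like the sketch below. I wrapped it in a function (install_hashcat is my own name) so that nothing runs until you call it. It assumes the upstream repository at github.com/hashcat/hashcat and that a plain make leaves the hashcat binary in the top of the source tree.

```shell
# Clone and build Hashcat under a prefix, then symlink the built binary.
# Run as root, since /opt and /usr/local/bin are usually root-owned.
install_hashcat() {
    prefix="${1:-/opt/hashcat}"
    link="${2:-/usr/local/bin/hashcat}"
    git clone https://github.com/hashcat/hashcat.git "$prefix" || return 1
    make -C "$prefix" || return 1
    ln -sf "$prefix/hashcat" "$link"
}
```

Calling install_hashcat with no arguments uses the two paths recommended above. Pass a different prefix and link path if you want to try it somewhere less invasive first.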
Now you can run hashcat -I to verify that Hashcat sees your GPU:
$ hashcat -I
hashcat (v5.1.0-1243-gd1f473d6+) starting...
OpenCL Info:
============
OpenCL Platform ID #1
Vendor..: Advanced Micro Devices, Inc.
Name....: AMD Accelerated Parallel Processing
Version.: OpenCL 2.0 AMD-APP.internal.dbg (2901.0)
Backend Device ID #1
Type...........: GPU
Vendor.ID......: 1
Vendor.........: Advanced Micro Devices, Inc.
Name...........: gfx803
Version........: OpenCL 1.2
Processor(s)...: 36
Clock..........: 1365
Memory.........: 6963/8192 MB allocatable
OpenCL.Version.: OpenCL C 2.0
Driver.Version.: 2901.0 (HSA1.1,LC)
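If you want to script that check, here is a small sketch that reads the hashcat -I output and looks for a device of type GPU. The has_gpu_device helper is hypothetical, and the pattern is based only on the output format shown above, which may differ between Hashcat versions.

```shell
# Read `hashcat -I` output on stdin; succeed if any device reports Type GPU.
# The pattern matches lines such as "Type...........: GPU".
has_gpu_device() {
    grep -Eq 'Type\.+: *GPU'
}
```

Usage would be something like hashcat -I | has_gpu_device && echo "GPU visible to Hashcat".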
You can run one of the included example scripts to test its functionality:
$ ./example400.sh
hashcat (v5.1.0-1243-gd1f473d6+) starting...
OpenCL API (OpenCL 2.0 AMD-APP.internal.dbg (2901.0)) - Platform #1 [Advanced Micro Devices, Inc.]
==================================================================================================
* Device #1: gfx803, 6963/8192 MB allocatable, 36MCU
Hashes: 1 digests; 1 unique digests, 1 unique salts
Bitmaps: 16 bits, 65536 entries, 0x0000ffff mask, 262144 bytes, 5/13 rotates
Rules: 1
Applicable optimizers:
* Zero-Byte
* Single-Hash
* Single-Salt
Minimum password length supported by kernel: 0
Maximum password length supported by kernel: 256
ATTENTION! Pure (unoptimized) backend kernels selected.
Using pure kernels enables cracking longer passwords but for the price of drastically reduced performance.
If you want to switch to optimized backend kernels, append -O to your commandline.
See the above message to find out about the exact limits.
Watchdog: Temperature abort trigger set to 90c
Host memory required for this attack: 696 MB
Starting attack in stdin mode...
$H$9y5boZ2wsUlgl2tI6b5PrRoADzYfXD1:hash234
Session..........: hashcat
Status...........: Cracked
Hash.Name........: phpass
Hash.Target......: $H$9y5boZ2wsUlgl2tI6b5PrRoADzYfXD1
Time.Started.....: Sun Jul 21 20:44:08 2019 (1 sec)
Time.Estimated...: Sun Jul 21 20:44:09 2019 (0 secs)
Guess.Base.......: Pipe
Speed.#1.........: 0 H/s (7.11ms) @ Accel:1024 Loops:1024 Thr:64 Vec:1
Recovered........: 1/1 (100.00%) Digests
Progress.........: 128416
Rejected.........: 0
Restore.Point....: 0
Restore.Sub.#1...: Salt:0 Amplifier:0-0 Iteration:0-1024
Candidates.#1....: [Copying]
Hardware.Mon.#1..: Temp: 61c Fan: 40% Core:1365MHz Mem:2000MHz Bus:0
Started: Sun Jul 21 20:44:01 2019
Stopped: Sun Jul 21 20:44:09 2019
Lastly, I’d get rocm-smi and rocminfo to keep an eye on your hardware resources and see more detailed info.
These utilities do work without the custom kernel, in case you just want to see resource info:
========================ROCm System Management Interface========================
================================================================================
GPU Temp AvgPwr SCLK MCLK Fan Perf PwrCap VRAM% GPU%
0 74.0c 145.067W 1365Mhz 2000Mhz 60.0% auto 145.0W 33% 100%
================================================================================
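To keep an eye on temperature during a long run, you could pull just the GPU index and temperature out of that table. This is a sketch under the assumption that the columns are laid out as in the output above (the layout may change between ROCm releases), and gpu_temps is a helper name I invented.

```shell
# Read the rocm-smi summary table on stdin and print "<gpu index> <temp>" per GPU.
# A data row is one whose first field is a number and whose second ends in "c".
gpu_temps() {
    awk '$1 ~ /^[0-9]+$/ && $2 ~ /c$/ { sub(/c$/, "", $2); print $1, $2 }'
}
```

Piping the table above through gpu_temps would give one line per card, with the trailing "c" stripped so the value can be compared against a threshold.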
$ ./rocminfo
=====================
HSA System Attributes
=====================
Runtime Version: 1.1
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE
==========
HSA Agents
==========
*******
Agent 1
*******
Name: Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz
Marketing Name: Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz
Vendor Name: CPU
Feature: None specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0(0x0)
Queue Max Size: 0(0x0)
Queue Type: MULTI
Node: 0
Device Type: CPU
Cache Info:
L1: 32768(0x8000) KB
Chip ID: 0(0x0)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 5000
BDFID: 0
Internal Node ID: 0
Compute Unit: 16
SIMDs per CU: 0
Shader Engines: 0
Shader Arrs. per Eng.: 0
WatchPts on Addr. Ranges:1
Features: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
Size: 32869532(0x1f58c9c) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Acessible by all: TRUE
Pool 2
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 32869532(0x1f58c9c) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Acessible by all: TRUE
ISA Info:
N/A
*******
Agent 2
*******
Name: gfx803
Marketing Name: Ellesmere [Radeon RX 470/480/570/570X/580/580X]
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 128(0x80)
Queue Min Size: 4096(0x1000)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 1
Device Type: GPU
Cache Info:
L1: 16(0x10) KB
Chip ID: 26591(0x67df)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 1365
BDFID: 256
Internal Node ID: 1
Compute Unit: 36
SIMDs per CU: 4
Shader Engines: 4
Shader Arrs. per Eng.: 1
WatchPts on Addr. Ranges:4
Features: KERNEL_DISPATCH
Fast F16 Operation: FALSE
Wavefront Size: 64(0x40)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 40(0x28)
Max Work-item Per CU: 2560(0xa00)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
Max fbarriers/Workgrp: 32
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 8388608(0x800000) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Acessible by all: FALSE
Pool 2
Segment: GROUP
Size: 64(0x40) KB
Allocatable: FALSE
Alloc Granule: 0KB
Alloc Alignment: 0KB
Acessible by all: FALSE
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx803
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
*** Done ***
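That’s a lot of output if all you want is the name of your GPU agent (the gfx803 that Hashcat reports). Here is a sketch that filters it down, assuming the "Name:" and "Device Type:" lines appear in the order shown above. The gpu_agent_names helper is my own, not something rocminfo provides.

```shell
# Read rocminfo output on stdin and print the Name of every agent
# whose Device Type is GPU. Remembers the latest "Name:" line and
# prints it when the matching "Device Type: GPU" line is reached.
gpu_agent_names() {
    awk '$1 == "Name:" { name = $2 }
         $1 == "Device" && $2 == "Type:" && $3 == "GPU" { print name }'
}
```

Run it as ./rocminfo | gpu_agent_names to list only the GPU agents.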
Thanks for reading and happy cracking!