ARKit is fun, but in my limited experience with it, you can't achieve the full facial performance of a hand-keyed or mocap performance. I'm not planning to use ARKit in my project, so I'm not polishing this.
Unreal range of motion anim
Maya range of motion anim

Key each pose on its own frame (53 total). Export the animation to Unreal. Convert the animation to a Pose Asset. Name each pose according to Apple's ARKit blend shape names:
https://developer.apple.com/documentation/arkit/arfaceanchor/blendshapelocation
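As a rough sketch of the one-pose-per-frame layout described above, the snippet below builds a frame-to-name table in plain Python. The 52 names are the blend shape locations listed in Apple's documentation. Treating the 53rd frame as a neutral rest pose on frame 0 is my assumption, and `build_rom_plan`, `weights_at`, and the `"neutral"` label are hypothetical names used only for illustration.

```python
# The 52 blend shape location names from Apple's ARKit documentation.
ARKIT_BLENDSHAPES = [
    "eyeBlinkLeft", "eyeLookDownLeft", "eyeLookInLeft", "eyeLookOutLeft",
    "eyeLookUpLeft", "eyeSquintLeft", "eyeWideLeft",
    "eyeBlinkRight", "eyeLookDownRight", "eyeLookInRight", "eyeLookOutRight",
    "eyeLookUpRight", "eyeSquintRight", "eyeWideRight",
    "jawForward", "jawLeft", "jawRight", "jawOpen",
    "mouthClose", "mouthFunnel", "mouthPucker", "mouthLeft", "mouthRight",
    "mouthSmileLeft", "mouthSmileRight", "mouthFrownLeft", "mouthFrownRight",
    "mouthDimpleLeft", "mouthDimpleRight", "mouthStretchLeft", "mouthStretchRight",
    "mouthRollLower", "mouthRollUpper", "mouthShrugLower", "mouthShrugUpper",
    "mouthPressLeft", "mouthPressRight",
    "mouthLowerDownLeft", "mouthLowerDownRight",
    "mouthUpperUpLeft", "mouthUpperUpRight",
    "browDownLeft", "browDownRight", "browInnerUp",
    "browOuterUpLeft", "browOuterUpRight",
    "cheekPuff", "cheekSquintLeft", "cheekSquintRight",
    "noseSneerLeft", "noseSneerRight",
    "tongueOut",
]


def build_rom_plan(names, neutral_name="neutral"):
    """Return (frame, pose_name) pairs: a neutral pose on frame 0
    (an assumption), then one named pose per frame starting at 1."""
    plan = [(0, neutral_name)]
    plan.extend((frame, name) for frame, name in enumerate(names, start=1))
    return plan


def weights_at(names, frame):
    """Weights to key on a given frame: the pose that owns the frame
    is fully on (1.0), every other shape is off (0.0)."""
    return {name: 1.0 if i == frame else 0.0
            for i, name in enumerate(names, start=1)}


if __name__ == "__main__":
    for frame, name in build_rom_plan(ARKIT_BLENDSHAPES):
        print(frame, name)
```

The printed table doubles as a checklist when renaming the poses in the Pose Asset, since pose N should carry the name printed for frame N.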
Morphing in a different face, the blendshapes still hold together well.
I had a large improvement in fps as seen in the lower right corner.
The first clip is the stock MetaHumans rig; the second clip is the converted blendshape setup.
I want to start by saying the release of MetaHumans is incredible, and the detail they have achieved in their rigs is great. A bone-based facial system is great for rotating vertices around curved volumes.
I've been wanting to learn more about facial blendshape work, and the MetaHumans release has been a boon. Unfortunately, for my uses, 12 skin influences, 397-713 joints, and 669 blendshapes are too heavy.
I converted the MetaHumans bone rig shapes and rebuilt the logic and combinations solely as a blendshape rig (using bones for the eyes alone). I was getting around 16 fps in Maya with the MetaHumans rig and around 80-100 fps in Maya with the blendshape rig (roughly a 5x to 6x speedup). This setup uses 262 blendshapes and 9 bones.
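For readers new to combination shapes, here is a minimal sketch of one common way to drive them: the corrective shape's weight is the product of its driver weights, so it only switches on when all drivers are active together. This is a generic illustration, not necessarily the logic used in this rig, and `combo_weight` is a hypothetical helper name.

```python
def combo_weight(driver_weights):
    """Weight of a combination (corrective) shape as the product of
    its driver weights, each clamped to the 0..1 range."""
    weight = 1.0
    for d in driver_weights:
        weight *= min(max(d, 0.0), 1.0)
    return weight


if __name__ == "__main__":
    # Two hypothetical drivers, e.g. a jaw-open shape and a funnel shape.
    print(combo_weight([1.0, 1.0]))  # both fully on
    print(combo_weight([0.5, 0.5]))  # both half on
    print(combo_weight([1.0, 0.0]))  # one driver off
```

The product form keeps the corrective at zero unless all of its drivers are active, which is why it is a popular default for this kind of logic.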
ARKit is easier to use than I expected. You can use any custom head setup by exporting a ROM (range of motion) animation that follows Apple's blendshape guide, creating a Pose Asset from the animation, and labeling each frame with the corresponding pose name.
***To be clear, this is a rebuild of the MetaHumans bone rig as a pure blendshape rig. The model, hair, textures, and shapes are all from MetaHumans, and the materials are as well (the face material is adjusted to make the blend masks work with my custom rig).