What is Hachi?

Hachi

MELD Announces Hachi!

Hachi is our effort to develop a suite of security standards and tooling for Cardano Smart Contracts. The project will eventually be open-sourced. For now, we are focusing on research and experiments to find the best approach to each problem.

MELD has been funding Hachi on our own, but we hope the community will join us to develop Hachi as a community-driven project. The road ahead is nothing but pure excitement and creativity!

Hachi’s First Day on Alonzo Purple

Nearly two months after Hachi’s inception, today is our first day on an Alonzo testnet! Until now, we have put much of our effort into mastering Plutus Core: its formal specifications, representations, and compilation pipeline, as well as the interpreters, cost models, and more. Another key focus is Blockchain and Smart Contract Security in general. This is not trivial, since Cardano has a unique tech stack and architecture and is still in the early days of Smart Contract support. We’ve proposed over 20 different security tooling and service ideas and have been working hard to build a foundation for future research and development. We look forward to publishing our first Hachi White Paper in September. Stay tuned!

Back to the main topic, let’s go through our first day on Alonzo Purple.

Node Setup

We have good experience building and running Cardano nodes, so setting up Alonzo Purple was relatively straightforward. We compiled the node from source (@57cbbc9c58b9bc9f3fc54c116b742d1c4be72e79), fetched the config files, and wrote a quick startup script:

#!/bin/bash
# Start a cardano-node using the config files under ~/MELD/<network>-node.
NODE_CONFIG=testnet
DIRECTORY=~/MELD/${NODE_CONFIG}-node
TOPOLOGY=${DIRECTORY}/${NODE_CONFIG}-topology.json
DB_PATH=${DIRECTORY}/db
SOCKET_PATH=${DIRECTORY}/db/socket
CONFIG=${DIRECTORY}/${NODE_CONFIG}-config.json
cardano-node run \
  --topology "${TOPOLOGY}" \
  --database-path "${DB_PATH}" \
  --socket-path "${SOCKET_PATH}" \
  --config "${CONFIG}"
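Before launching the node, it can help to confirm that the files the script points at are actually in place. The following is a minimal sketch; check_node_files is a hypothetical helper name, and the file naming simply mirrors the variables in the startup script above.

```shell
# check_node_files DIR NAME
# Hypothetical helper: verify that the topology and config files expected by
# the startup script exist. Returns non-zero if any file is missing.
check_node_files() {
  dir=$1
  name=$2
  missing=0
  for f in "$dir/$name-topology.json" "$dir/$name-config.json"; do
    if [ ! -f "$f" ]; then
      echo "missing: $f" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example (not executed here):
#   check_node_files ~/MELD/testnet-node testnet && ./start-node.sh
```

Once the node is running, cardano-cli locates it through the CARDANO_NODE_SOCKET_PATH environment variable, so exporting it with the same value as SOCKET_PATH saves repeating the path in later commands.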

Chain Index

While waiting for the node to sync, we continued to experiment with the new plutus-chain-index component, which offers efficient storage and queries of scripts, datums, and redeemers for security research. To develop our Dapp, we have been following the IOHK team’s numerous Chain Index PRs lately and have become familiar with the codebase.

Testnet nodes sync quickly, so we only had time to open a quick PR improving CLI config handling before getting back to the synced node.

The PR includes:

  • Further distinguishing the chain index and logging configs.
  • Support for printing default configs to a file.

Previously, the DumpDefaultConfig CLI command was unhandled, and the descriptions of the two config types were mixed up in a few places.

We are going to write more about Chain Index in our Plutus documentation effort.

We also made an attempt to introduce this component into plutus-starter.

Monitor Scripts & Data

Cardano CLI

Coming back to the synced node, the first thing that came to our mind was to query data with cardano-cli.

We first checked the ledger state:

cardano-cli query ledger-state --testnet-magic 1097911063

Then we looked at the UTXOs:

cardano-cli query utxo --whole-utxo --testnet-magic 1097911063

We ran into an interesting error with query utxo:

cardano-cli: <stdout>: commitAndReleaseBuffer: invalid argument (invalid character)

It looks like an encoding bug on the presentation side that has already been reported. We may open a PR to fix this issue later.
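Errors of this shape usually come from the GHC runtime when the locale encoding of stdout cannot represent a character being printed. A common workaround for Haskell programs in general, not verified here against this specific cardano-cli bug, is to force a UTF-8 locale for the command. The sketch below assumes the C.UTF-8 locale is installed; with_utf8_locale is a hypothetical helper name.

```shell
# with_utf8_locale CMD [ARGS...]
# Hypothetical helper: run a command with a UTF-8 locale forced for that
# command only. Assumption: the C.UTF-8 locale exists on this machine.
with_utf8_locale() {
  env LC_ALL=C.UTF-8 LANG=C.UTF-8 "$@"
}

# Example (not executed here):
#   with_utf8_locale cardano-cli query utxo --whole-utxo --testnet-magic 1097911063
```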

We continued with a grep for datum hashes, a hint that contracts had already been deployed!

$ cardano-cli query ledger-state --testnet-magic 1097911063 | grep 'datahash": "'

"datahash": "b1141c05ac58491c724c1b223ed01f7bca96ed7ea396c6f194b61d917b65b2d0",
"datahash": "8f42ba9d7761a7fad6198bffc4659c8be0d688b5a744a3fe2580f3e35f66d8a3",
"datahash": "fcaa61fb85676101d9e3398a484674e71c45c3fd41b492682f3b0054f4cf3273",
"datahash": "fcaa61fb85676101d9e3398a484674e71c45c3fd41b492682f3b0054f4cf3273",
"datahash": "fcaa61fb85676101d9e3398a484674e71c45c3fd41b492682f3b0054f4cf3273",
"datahash": "4932dce28712ccc4858e3d83cc8e79b12740f66007dc9a287bb640a264c899de",
"datahash": "93978341ed4f41d3a7e7afa234e31c4e8cde68f54bd342885f421103c2c2ba02",
"datahash": "f10cf66e117391cd8c37494e5e6527e506cfdd176be72d6b396c68348a7a069e",
"datahash": "d70d3db4f4f2f9e3321d36c80ddfa9db32c3c2b210349ee21c52a1f02843ed91",
"datahash": "9ad30ffde0d1931ed4f145fa0a0d320a067051bfab1b08cbdb79e9f26df55df3",
"datahash": "93978341ed4f41d3a7e7afa234e31c4e8cde68f54bd342885f421103c2c2ba02",
"datahash": "93978341ed4f41d3a7e7afa234e31c4e8cde68f54bd342885f421103c2c2ba02",
"datahash": "fcaa61fb85676101d9e3398a484674e71c45c3fd41b492682f3b0054f4cf3273",
"datahash": "8f42ba9d7761a7fad6198bffc4659c8be0d688b5a744a3fe2580f3e35f66d8a3",
"datahash": "a9be31977ee0a75de1efdeae66b3e49aabec3d20f61ac487d262a3a5aad28ba5",
"datahash": "fcaa61fb85676101d9e3398a484674e71c45c3fd41b492682f3b0054f4cf3273",
"datahash": "9ad30ffde0d1931ed4f145fa0a0d320a067051bfab1b08cbdb79e9f26df55df3",
"datahash": "ebfe829ec8d2c43031b365765332b73268352e2a9e1db7fccb3be08490865c3c",
"datahash": "fcaa61fb85676101d9e3398a484674e71c45c3fd41b492682f3b0054f4cf3273",
"datahash": "310be1f47c9d5bffc8de44cb930450ac2b82a82d3c7230f8f927b53a4d670087",
"datahash": "93978341ed4f41d3a7e7afa234e31c4e8cde68f54bd342885f421103c2c2ba02",
"datahash": "a23dee4ad091c8c4ffb93e16f8d90cf5a513952d136c88c0f17892338f8a206d",

We could see a reasonable number of similar scripts deployed to the Testnet on the first day of public Alonzo Purple.
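To see at a glance which datum hashes recur, the grep output can be tallied. This is a minimal sketch that assumes the ledger-state JSON keeps the "datahash": "…" layout shown above; count_datum_hashes is a hypothetical helper name, not part of cardano-cli.

```shell
# count_datum_hashes
# Hypothetical helper: read ledger-state JSON on stdin and print
# "<count> <datum hash>" lines, most frequent hash first.
count_datum_hashes() {
  grep -o '"datahash": "[0-9a-f]*"' |
    cut -d'"' -f4 |
    sort | uniq -c | sort -rn |
    awk '{ print $1, $2 }'
}

# Example (not executed here):
#   cardano-cli query ledger-state --testnet-magic 1097911063 | count_datum_hashes
```

On the listing above, the fcaa61fb… hash should rank first, which makes it a natural first candidate to investigate.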

Haskell Syncer

Since plutus-chain-index wasn’t quite ready yet, we quickly prototyped our own syncer to get scripts, datums, redeemers, and all important transaction data for security research.

We started by customizing this demo program:

The first thing to do is update the socketPath and localNodeNetworkId configs to match our node setup.

-  let socketPath = socketDir </> "node.sock"
+  let socketPath = socketDir </> "socket"
-        localNodeNetworkId       = Mainnet,
+        localNodeNetworkId       = Testnet (NetworkMagic 1097911063),

We then pattern match only on Alonzo blocks for further processing:

clientStNext =
  ClientStNext {
    recvMsgRollForward = \b _ -> ChainSyncClient $ do
      case b of
        BlockInMode block AlonzoEraInCardanoMode -> ...
        -- ^ process the Alonzo block here!
...

We then pattern match on the block with Block _ transactions, and on each transaction with:

ShelleyTx ShelleyBasedEraAlonzo (Alonzo.ValidatedTx _ (Alonzo.TxWitness _ _  scripts datums redeemers) _ _)

Then, for each script, we match on:

(ScriptHash h, Alonzo.PlutusScript sbs)

We then store the script bytestring, relevant datums, and redeemers in local files for further queries. With this setup, we can now track all deployed contracts, which can serve as case studies for Hachi tools. Chain Index should improve this process significantly.

Script Representation

A Plutus script can be represented in many ways. At one end, we have the raw bytes found above. At the other, we have the original high-level Haskell code. In security, deciding which analysis to run on which representation is an exciting endeavor filled with creativity. We should not analyze raw bytes directly, as they are too unstructured for many techniques to operate on. At the same time, source-level analysis depends too much on a subset of large and diverse API designs and has to trust the compilation pipeline by default. It might also be hard to port Haskell-specific tools to new source languages in the future.

With that in mind, we have been experimenting with the in-between representations, preferably toward the low end, for more precision. Here are a few we worked on today.

Plutus.V1.Ledger.Scripts.Script

This is usually our first go-to representation, easily constructed from the script bytestring:

bs <- BS.readFile "script.raw"
case scriptFromBytes (SBS.toShort bs) of
  Left deserialiseErr -> ...
  Right script -> ...
  -- ^ This is what we want

With the following helper functions:

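{-# LANGUAGE TypeApplications #-}
-- Imports are assumed from the qualified names used in this post.
import Codec.Serialise (DeserialiseFailure, Serialise, deserialiseOrFail)
import qualified Data.ByteString.Lazy as LBS
import qualified Data.ByteString.Short as SBS
import qualified Plutus.V1.Ledger.Scripts as Script

deserialiseBytes ::
  Serialise a => SBS.ShortByteString -> Either DeserialiseFailure a
deserialiseBytes = deserialiseOrFail . LBS.fromStrict . SBS.fromShort

scriptFromBytes ::
  SBS.ShortByteString -> Either DeserialiseFailure Script.Script
scriptFromBytes = deserialiseBytes @Script.Script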

We can then print the Untyped Plutus Core AST with:

print (Script.unScript script)
Program () (Version () 1 0 0) (Apply () (Apply () (LamAbs () (DeBruijn {dbnIndex = 0}) (LamAbs () (DeBruijn {dbnIndex = 0}) (LamAbs () (DeBruijn {dbnIndex = 0}) (LamAbs () (DeBruijn {dbnIndex = 0}) (LamAbs () (DeBruijn {dbnIndex = 0}) (Var () (DeBruijn {dbnIndex = 5}))))))) (Delay () (LamAbs () (DeBruijn {dbnIndex = 0}) (Var () (DeBruijn {dbnIndex = 1}))))) (LamAbs () (DeBruijn {dbnIndex = 0}) (Var () (DeBruijn {dbnIndex = 1}))))

We can also apply arguments to the script:

let appliedScript = Script.applyArguments script [PLC.I 0x48656c6c6f21, PLC.I 2, PLC.I 3]

And evaluate the script, before or after argument application:

case Script.mkTermToEvaluate s of
  Left freeVariableErr -> ...
  Right (UPLC.Program _ _ term) ->
    print $ Cek.evaluateCekNoEmit PLC.defaultCekParameters term
Right (Delay () (LamAbs () (Name {nameString = "", nameUnique = Unique {unUnique = 5}}) (Var () (Name {nameString = "", nameUnique = Unique {unUnique = 5}}))))

Pretty-print

We can also pretty-print a script with:

print (pretty (Script.unScript script))
(program 1.0.0
  [ [ (lam  (lam  (lam  (lam  (lam  ))))) (delay (lam  )) ] (lam  ) ]
)

This can get non-pretty quickly as the script gets more complex. For example, a Hello World program looks like this:

(program 1.0.0
  [
    [
      [
        (lam

          (lam

            (lam

              [
                (lam

                  [
                    (lam

                      [
                        (lam

                          [
                            [
                              (lam

                                (lam

                                  [
                                    (lam

                                      [
                                        (lam

                                          (lam

                                            (lam

                                              (lam

                                                (force
                                                  [
                                                    [
                                                      (force
                                                        [

                                                          [
                                                            [ [ (force )  ]  ]
                                                            (force )
                                                          ]
                                                        ]
                                                      )
                                                      (delay )
                                                    ]
                                                    (delay [ (force )  ])
                                                  ]
                                                )
                                              )
                                            )
                                          )
                                        )
                                        (delay
                                          [
                                            (builtin bData)
                                            (con
                                              bytestring
                                                #48656c6c6f20576f726c6421
                                            )
                                          ]
                                        )
                                      ]
                                    )
                                    (delay
                                      (lam

                                        [
                                          (force )
                                          [ (force [   ]) (con unit ()) ]
                                        ]
                                      )
                                    )
                                  ]
                                )
                              )
                              (delay (lam  ))
                            ]
                            (lam  )
                          ]
                        )
                        (delay (lam  (error)))
                      ]
                    )
                    (lam

                      (lam

                        [
                          [
                            [
                              (force (builtin ifThenElse))
                              [ [ (builtin equalsData)  ]  ]
                            ]

                          ]

                        ]
                      )
                    )
                  ]
                )
                (delay (lam  ))
              ]
            )
          )
        )
        (delay (lam  (lam  )))
      ]
      (delay (lam  (lam  )))
    ]
    (lam  )
  ]
)

That said, pretty-printed programs are good source code for further transpilers and compilers, especially for small languages with concise specifications like Plutus Core. For example, as demonstrated in previous reports, we have been working on a Racket transpiler to utilize the more abundant tooling there. plutus-core also provides a few CLI programs like pir, plc, and uplc for interaction with this representation.

Previously, we had little trouble working with pretty-printed code that we compiled from Haskell. But today’s pretty-printed DeBruijn raw programs don’t seem to follow the formal specifications. To progress faster, we extended our transpiler to support the printed Term AST from Haskell:

case Script.mkTermToEvaluate s of
  Left freeVariableErr -> ...
  Right (UPLC.Program _ _ term) -> ...
  --                      ^ This term
Apply () (Apply () (LamAbs () (Name {nameString = "", nameUnique = Unique {unUnique = 0}}) (LamAbs () (Name {nameString = "", nameUnique = Unique {unUnique = 1}}) (LamAbs () (Name {nameString = "", nameUnique = Unique {unUnique = 2}}) (LamAbs () (Name {nameString = "", nameUnique = Unique {unUnique = 3}}) (LamAbs () (Name {nameString = "", nameUnique = Unique {unUnique = 4}}) (Var () (Name {nameString = "", nameUnique = Unique {unUnique = 0}}))))))) (Delay () (LamAbs () (Name {nameString = "", nameUnique = Unique {unUnique = 5}}) (Var () (Name {nameString = "", nameUnique = Unique {unUnique = 5}}))))) (LamAbs () (Name {nameString = "", nameUnique = Unique {unUnique = 6}}) (Var () (Name {nameString = "", nameUnique = Unique {unUnique = 6}})))

We apply an empty [] to make a smaller term for the transpiler to work with:

Right (LamAbs () (Name {nameString = "", nameUnique = Unique {unUnique = 2}}) (LamAbs () (Name {nameString = "", nameUnique = Unique {unUnique = 3}}) (LamAbs () (Name {nameString = "", nameUnique = Unique {unUnique = 4}}) (Delay () (LamAbs () (Name {nameString = "", nameUnique = Unique {unUnique = 5}}) (Var () (Name {nameString = "", nameUnique = Unique {unUnique = 5}})))))))

Racket

We have been developing a Racket transpiler with helper functions to evaluate the original Plutus Core programs and solvers to find data that bypasses the script in Racket.

Always True:

cat AlwaysTrue.term | hc
(λ (v2) (λ (v3) (λ (v4) (delay (λ (v5) v5)))))

Hello World:

cat HelloWorld.term | hc
(λ (v10) (λ (v11) (λ (v12) (force (((force ((λ (v24) v24) ((((λ (v19) v19) (λ (v17) (λ (v18) ((((force (ifThenElse)) (((equalsData) v17) v18)) (delay (λ (v20) (λ (v21) v20)))) (delay (λ (v22) (λ (v23) v23))))))) v10) ((bData) (bytes #x48 #x65 #x6C #x6C #x6F))))) (delay (delay (λ (v14) v14)))) (delay ((λ (v13) ((λ (v16) (assert #f "Validate failed")) ((force ((λ (v15) v15) v13)) unitval))) (delay (λ (v14) v14)))))))))

We use visual graphs of flattened ASTs to make it more human-friendly to reason about:

Hello World AST

We are still working to improve our transpiler and will write more about it soon.

Target #1: Always True

DISCLAIMER: The following content is not actual exploitation. We will not publish vulnerabilities of non-trivial scripts but will try our best to report them to the related projects. Please do the same if you try to replicate or do something similar.

We now move on to our first exploitation experiment: the Always True script. It is a validator script that always approves transactions that spend its outputs.

mkValidator :: BuiltinData -> BuiltinData -> BuiltinData -> ()
mkValidator _ _ _ = ()

(program 1.0.0
  [ [ (lam  (lam  (lam  (lam  (lam  ))))) (delay (lam  )) ] (lam  ) ]
)

Program () (Version () 1 0 0) (Apply () (Apply () (LamAbs () (DeBruijn {dbnIndex = 0}) (LamAbs () (DeBruijn {dbnIndex = 0}) (LamAbs () (DeBruijn {dbnIndex = 0}) (LamAbs () (DeBruijn {dbnIndex = 0}) (LamAbs () (DeBruijn {dbnIndex = 0}) (Var () (DeBruijn {dbnIndex = 5}))))))) (Delay () (LamAbs () (DeBruijn {dbnIndex = 0}) (Var () (DeBruijn {dbnIndex = 1}))))) (LamAbs () (DeBruijn {dbnIndex = 0}) (Var () (DeBruijn {dbnIndex = 1}))))

(λ (v2) (λ (v3) (λ (v4) (delay (λ (v5) v5)))))

There are multiple ways to track the on-chain script. Our favorite thus far is to find the address from the on-chain script bytes. First, we serialize the script once more and write it to a text envelope:

writeFileTextEnvelope @(PlutusScript PlutusScriptV1) "AlwaysTrue.plutus" Nothing (PlutusScriptSerialised (SBS.toShort bs))
{
  "type": "PlutusScriptV1",
  "description": "",
  "cborHex": "4e4d01000033222220051200120011"
}
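Reading the cborHex above byte by byte suggests two nested CBOR byte strings: 4e is the header of a 14-byte string, and its payload starts with 4d, the header of a 13-byte string. The remaining 13 bytes begin with 01 00 00, which appears to line up with the 1.0.0 program version. The sketch below strips one such header at a time; cbor_unwrap_short_bytes is a hypothetical helper, and it only handles byte strings shorter than 24 bytes, whose length fits in the header byte.

```shell
# cbor_unwrap_short_bytes HEX
# Hypothetical helper: strip one CBOR byte-string header (major type 2,
# length below 24) from a hex string and print the payload as hex.
cbor_unwrap_short_bytes() {
  hex=$1
  # First byte as a decimal number: top 3 bits = major type, low 5 = length.
  head=$(printf '%d' "0x$(printf '%s' "$hex" | cut -c1-2)")
  major=$((head / 32))
  len=$((head % 32))
  body=$(printf '%s' "$hex" | cut -c3-)
  if [ "$major" -ne 2 ] || [ "$len" -ge 24 ] || [ "${#body}" -ne $((len * 2)) ]; then
    echo "not a short CBOR byte string" >&2
    return 1
  fi
  printf '%s\n' "$body"
}

# Peel both layers of the envelope shown above.
outer=$(cbor_unwrap_short_bytes 4e4d01000033222220051200120011)
inner=$(cbor_unwrap_short_bytes "$outer")
echo "$inner"
```

Longer scripts use multi-byte length headers, so a real CBOR decoder is the better choice beyond toy examples like this one.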

We then get the on-chain address of the script:

$ cardano-cli address build --testnet-magic 1097911063 --payment-script-file AlwaysTrue.plutus
addr_test1wpnlxv2xv9a9ucvnvzqakwepzl9ltx7jzgm53av2e9ncv4sysemm8

Then we find the target UTXOs:

$ cardano-cli query utxo --testnet-magic 1097911063 --address addr_test1wpnlxv2xv9a9ucvnvzqakwepzl9ltx7jzgm53av2e9ncv4sysemm8
                           TxHash                                 TxIx        Amount
--------------------------------------------------------------------------------------
22909f2a82cdf271cc09824f8f2082910553694d33fae9c8d75abba22b4a8fb9     1        150000000 lovelace + TxOutDatumHash ScriptDataInAlonzoEra "b1141c05ac58491c724c1b223ed01f7bca96ed7ea396c6f194b61d917b65b2d0"
4ad1845d3a5cb6cce7d166caca7c7dcd139fa0aa832656fac798e422da3fcdf0     1        10000000 lovelace + TxOutDatumHash ScriptDataInAlonzoEra "a23dee4ad091c8c4ffb93e16f8d90cf5a513952d136c88c0f17892338f8a206d"
5f2ef2880f24a1111b7e9474e5e3b11c15538000052904e2d8c022e8de5e06e7     0        770000000 lovelace + TxOutDatumHash ScriptDataInAlonzoEra "fcaa61fb85676101d9e3398a484674e71c45c3fd41b492682f3b0054f4cf3273"
6b439ca8bdb069d7d7bc203ba5b61c2b27ff8c6cca4767864860ea11120c0db8     0        770000000 lovelace + TxOutDatumHash ScriptDataInAlonzoEra "fcaa61fb85676101d9e3398a484674e71c45c3fd41b492682f3b0054f4cf3273"
a21766d6f0a6ce93386243217329cffdbcb1e1f8c291fa2a0c4b2132c0b62dee     1        1000000000 lovelace + TxOutDatumHash ScriptDataInAlonzoEra "fcaa61fb85676101d9e3398a484674e71c45c3fd41b492682f3b0054f4cf3273"
a6752e5260757103cfa63f8346f3e5c705523c24fef443cea850dc6316e85fa3     1        1000000000 lovelace + TxOutDatumHash ScriptDataInAlonzoEra "fcaa61fb85676101d9e3398a484674e71c45c3fd41b492682f3b0054f4cf3273"
d1afee56aecb50b4ba7d5ced38ca92f43a2c17af443867a9aa9b0b2847d40377     0        770000000 lovelace + TxOutDatumHash ScriptDataInAlonzoEra "fcaa61fb85676101d9e3398a484674e71c45c3fd41b492682f3b0054f4cf3273"
e56089a6c1f6e18bb8b22f17970938004797b4ee5dcf55da547d639b83015634     0        770000000 lovelace + TxOutDatumHash ScriptDataInAlonzoEra "fcaa61fb85676101d9e3398a484674e71c45c3fd41b492682f3b0054f4cf3273"
e70da16d28c4ba234498b2c6222284076a88bab94d7223f1a27ff1f597308454     0        770000000 lovelace + TxOutDatumHash ScriptDataInAlonzoEra "fcaa61fb85676101d9e3398a484674e71c45c3fd41b492682f3b0054f4cf3273"
edebfe352ef225edf03fe254ae26141c45b530478d4353db45aa28c4eea1acde     0        16661337 lovelace + TxOutDatumHashNone
f852223988fba77af7752c8edad4203ffb19a0472b62e59ceff7ddb39877a9aa     1        20000000 lovelace + TxOutDatumHash ScriptDataInAlonzoEra "310be1f47c9d5bffc8de44cb930450ac2b82a82d3c7230f8f927b53a4d670087"

We can easily see the repeated 770000000 lovelace + TxOutDatumHash ScriptDataInAlonzoEra "fcaa61fb85676101d9e3398a484674e71c45c3fd41b492682f3b0054f4cf3273" entries here, which are indeed our first target.
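Spotting repeats by eye works for a dozen rows; for longer listings, the table can be grouped by datum hash. This is a minimal sketch that assumes the column layout printed above (amount in the third column, datum hash in the eighth); summarize_utxos_by_datum is a hypothetical helper name.

```shell
# summarize_utxos_by_datum
# Hypothetical helper: read `cardano-cli query utxo` table output on stdin and
# print "<outputs> <total lovelace> <datum hash>" per hash, largest group first.
summarize_utxos_by_datum() {
  awk '$4 == "lovelace" {
         key = ($6 == "TxOutDatumHash") ? $8 : "none"
         gsub(/"/, "", key)
         count[key]++
         total[key] += $3
       }
       END {
         for (k in count) printf "%d %.0f %s\n", count[k], total[k], k
       }' |
    sort -rn
}

# Example (not executed here):
#   cardano-cli query utxo --testnet-magic 1097911063 --address "$ADDR" | summarize_utxos_by_datum
```

Header and separator rows are skipped because their fourth column is not "lovelace", and outputs without a datum hash are grouped under "none".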

From our syncer, we can see that the Datum of these is:

TxDatsRaw (fromList [(SafeHash "fcaa61fb85676101d9e3398a484674e71c45c3fd41b492682f3b0054f4cf3273",DataConstr Constr 0 [I 42])])

And below is the redeemer:

RedeemersRaw (fromList [(RdmrPtr Spend 0,(DataConstr Constr 0 [I 42],ExUnits {exUnitsMem = 1700, exUnitsSteps = 476468}))])

So a withdrawal transaction can be set up quite easily:

export TARGET_UTXO=5f2ef2880f24a1111b7e9474e5e3b11c15538000052904e2d8c022e8de5e06e7#0
export COLLATERAL=b9d171e1c7109a11fefa72b5c2eeb5555f00e153ddbe37a28bef61a7cd1f20d0#0

cardano-cli transaction build \
  --alonzo-era \
  --tx-in ${TARGET_UTXO} \
  --tx-in-script-file original.plutus \
  --tx-in-datum-file typed-42.json \
  --tx-in-redeemer-file typed-42.json \
  --tx-in-collateral ${COLLATERAL} \
  --change-address $(cat payment.addr) \
  --protocol-params-file pparams.json \
  --testnet-magic 1097911063 \
  --out-file tx.raw

We initially had some trouble with the JSON encoding of Plutus Core Data. DataConstr Constr 0 [I 42] is easy to read and construct in Haskell, but we had no idea how to write the corresponding JSON, and cardano-cli transaction hash-script-data, --tx-in-datum, and --tx-in-redeemer all require data in JSON format. After a few minutes we got bored with trial and error and simply searched for the pattern online, which led us to the source at input-output-hk/cardano-node@f1cbf4e209b58f22213e0cbc2b862daabda353d2. We then learned the correct encoding from input-output-hk/cardano-node/scripts/plutus/data/typed-42.datum:

{"constructor":0,"fields":[{"int":42}]}
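For datums of this shape, a constructor applied to integer fields, the JSON can be generated rather than typed by hand. The sketch below only covers that one shape, following the typed-42.datum example; constr_ints_json is a hypothetical helper name.

```shell
# constr_ints_json TAG [INT...]
# Hypothetical helper: print the JSON for a constructor with integer fields,
# e.g. `constr_ints_json 0 42` for Constr 0 [I 42].
constr_ints_json() {
  tag=$1
  shift
  fields=""
  for n in "$@"; do
    # Append a comma only when a previous field is already present.
    fields="${fields:+$fields,}{\"int\":$n}"
  done
  printf '{"constructor":%s,"fields":[%s]}\n' "$tag" "$fields"
}

# Example (not executed here):
#   constr_ints_json 0 42 > typed-42.json
```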

We then tried a different target:

TxDatsRaw (fromList [(SafeHash "4932dce28712ccc4858e3d83cc8e79b12740f66007dc9a287bb640a264c899de",DataConstr I 1337)])
RedeemersRaw (fromList [(RdmrPtr Spend 0,(DataConstr I 42666,ExUnits {exUnitsMem = 2500, exUnitsSteps = 1993190}))])

export TARGET_UTXO=741c9ac1df066995ee52d00492bd825f220f883ce3e459dea18d4dd60256e52d#0

cardano-cli transaction build \
  --alonzo-era \
  --tx-in ${TARGET_UTXO} \
  --tx-in-script-file original.plutus \
  --tx-in-datum-value 1337 \
  --tx-in-redeemer-value 42666 \
  --tx-in-collateral ${COLLATERAL} \
  --change-address $(cat payment.addr) \
  --protocol-params-file pparams.json \
  --testnet-magic 1097911063 \
  --out-file tx.raw

And again, we withdrew the funds successfully. We then called it a decent first day on Alonzo Purple and wrapped it up with this blog post.

What’s Next?

We still have a lot of work to do and explore.

  1. Continue to analyze deployed scripts on the public testnet. We’re eyeing the Hello World, Guess Game, and a few Oracles and DEXes out there. It’s fascinating to see brand new Datum structures now and then!
  2. Keep monitoring the progress of plutus-chain-index and help when possible. All for a stable and efficient syncer/DB for security work.
  3. Continue to deep dive into Plutus, further inspect the problem with pretty-printed raw DeBruijn programs, and offer help if appropriate.
  4. Continue to deep dive into the current Plutus interpreters. Find vulnerabilities in the interpreters (a sane way to bypass even the AlwaysFails validator script). Plan instrumentation support for fuzzers. Offer help when appropriate.
  5. Continue to push the Racket transpiler, interpreter, and all the tooling and techniques on that end, along with other sub-projects like Hachi Lint and static analysis with XML trees & XPath.
  6. Continue to ponder on the Bounty Program from the Cardano Foundation.
  7. Write and publish more White Papers and technical reports.