Reverse Engineering SmartLoader: Where Commodity Malware and Operational Tooling Diverge

Written by Alice Duarte

Introduction

SmartLoader is a commodity loader currently in active deployment, attributed to Malware-as-a-Service infrastructure that has been linked to LummaStealer delivery in recent campaigns. The sample is interesting less for any single novel primitive and more for what it reveals about the development stack and tooling choices behind modern loaders: a stock LuaJIT interpreter as the execution substrate, a Lua VM obfuscator from the game hacking scene as the packer, a hand-rolled Win32 binding layer built on LuaJIT’s FFI, and a smart contract as the C2 resolver.

This post examines SmartLoader as a piece of malware engineering rather than as a defender’s work list. The IOCs are at the bottom for those who want them, but the body of the analysis is concerned with how the loader was built: which off-the-shelf components were used, which were written from scratch, where the development effort was spent, and how the resulting code architecture compares to modern offensive tooling baselines. Where relevant, each technique is placed against the current malware development ecosystem (commercial obfuscators, open-source malware development libraries, and modular C2 frameworks) to give a sense of where SmartLoader sits relative to the rest of the stack, and of how development choices made for “MaaS” goals and architecture compare with those made for operational (red team) tooling.

The loader breaks into three stages: an initial execution and reconnaissance layer built on a signed third-party interpreter, blockchain-based C2 resolution and per-victim tasking, and a persistence layer that pulls stage-3 payloads from rotating GitHub repositories.


Stage 1 — LuaJIT as a Loader Substrate

1.1 LuaJIT as a Malware Development Substrate

The sample ships as a ZIP containing four files: a batch launcher (start luajit.exe lang.txt), a clean LuaJIT 2.1.0-beta3 interpreter, its runtime DLL with full FFI support, and a 354 KB obfuscated Lua payload.

Picking LuaJIT as a development platform for a loader is a non-obvious choice and worth unpacking. LuaJIT offers four things to a malware developer that are hard to assemble elsewhere in one package:

  1. A mature, production-quality FFI that maps directly to the Win32 C ABI without an intermediate native shim. Most embeddable languages either lack this entirely or expose it through a thin, awkward wrapper.
  2. A small, single-file runtime (lua51.dll) with no external dependencies and no installer footprint.
  3. A signed, widely distributed binary that already exists in many developer environments and triggers neither reputation nor “unknown PE” alarms.
  4. A JIT compiler that produces respectable performance, which matters when the loader is also running a large interpreted VM on top of the host VM.

This places SmartLoader inside the broader family of interpreted-language stagers, a development pattern that has been gaining traction over the last few years alongside the more visible Nim and Zig stagers. The closest direct parallels are embedded-Python loaders, the JavaScript / mshta payload family, and the Lua-based loaders that have appeared periodically in game hacking-adjacent malware. The defining property of the pattern is that the developer never ships their own native binary; instead execution is borrowed from a clean third-party runtime, and the developer’s code lives entirely as data the runtime consumes.

The development tradeoff is the bundle size. SmartLoader ships four files instead of one, which is more conspicuous on disk than a single packed PE. The payoff is a substantially smaller static-detection surface: no custom imports, no unusual section layout, no compiler signature.

It is important to keep in mind that SmartLoader is a MaaS tool. Most campaign targets are not hardened corporate environments with mature EDRs, detection rules, and other defense layers, but consumer-grade systems operated by average users. Narrowing the targets this way lets the developers employ techniques that would be ineffective against hardened environments but fit their breadth-over-depth model perfectly. It eases the development burden and removes the constant adaptation to defensive tooling that typical red team operations require, where precision and depth are preferred over breadth in most scenarios.

1.2 Commercial Lua Obfuscator Integration

The Lua payload is syntactically valid but heavily obfuscated. The entire script is wrapped in a closure containing an encrypted constant table and a small interpreter that walks an instruction stream. Code and strings remain encrypted until runtime and are decrypted only on access through an indexed accessor function.

This obfuscation profile is consistent with the Prometheus / IronBrew lineage, a family of commercial Lua VM obfuscators that grew out of the Roblox cheating ecosystem and has been productised over the last several years. Lua obfuscation has been used in the game cheating scene for almost 10 years; more recent iterations evolve the older techniques, expand their scope, and add a commercial model. The Lua-obfuscator scene has three main commercial products (Prometheus, IronBrew, Luraph), all sold on subscription pricing, all targeted primarily at game-cheat developers, and all of which produce output with the structural fingerprint visible here: a constant table, a dispatch loop, and encrypted strings unwrapped via an accessor function. They are conceptually equivalent to VMProtect / Themida / Enigma for native PEs, with similar pricing models and similar limitations on the kinds of analysis they actually defeat.

This is significant from a development perspective because the obfuscation layer is not something the malware authors wrote. It is bought, licensed, and integrated into the build pipeline. The development workflow is plausibly:

write_lua_source -> run_through_obfuscator -> bundle_with_luajit -> zip -> ship

That is materially different from a hand-rolled packer and has implications for analysis: defeating the obfuscation does not produce attributable code, only the de-virtualised version of code that ran through someone else’s product. The VM itself is the obfuscator vendor’s IP, not the malware author’s.

This is very common in the MaaS ecosystem, where crypters and packers are usually bundled with the main payload. In red team operations, this type of tooling is also essential for bypassing defenses, especially when experienced defenders attempt to reverse engineer the tools. There are scenarios where binary size matters more than anti-reversing strength, or where binaries with very high entropy are flagged by a detection rule. For those scenarios it is useful to keep a range of tools, each tailored to a specific use case. Modern tooling chains reflect this: they usually start with minimalistic tooling and bring in heavier artifacts only when a specific scenario requires them.

The packer-VM design has a meaningful analytical property worth flagging. Older Lua packers built on loadstring() or load() can be defeated by hooking the loader and dumping plaintext at the moment of evaluation. A dispatch-VM packer has no such moment, since the plaintext only exists as the intermediate state of the emulator’s register file. Each VM instruction directly executes Lua operations via FFI, which means static analysis is reduced to either re-implementing the VM or instrumenting LuaJIT at runtime. The same property is what VMProtect and Themida sell to legitimate software vendors.

1.3 FFI as a Win32 Binding Layer

The VM’s first actions are require("ffi") and a single ffi.cdef call declaring the entire Windows surface area the loader will ever touch. The declarations cover PE header structures for both 32 and 64-bit images (IMAGE_DOS_HEADER, IMAGE_NT_HEADERS32/64, IMAGE_EXPORT_DIRECTORY), PEB structures (PEB, PEB_LDR_DATA, LDR_DATA_TABLE_ENTRY), GDI structures for screenshot capture, system info structures, and prototypes for VirtualAlloc, GetComputerNameW, Sleep, GetSystemMetrics, GetDC, BitBlt, CreateDIBSection, IsWow64Process, MessageBoxW, and several others.

From a development perspective this is the Win32 binding layer the developer chose to maintain in source. It is functionally equivalent to what you would write at the top of a C maldev project (#include <windows.h> plus a typedef block for the undocumented PEB structures), except expressed in Lua FFI’s C-declaration syntax. Both architectures’ PE structures are present in the same cdef block, alongside an IsWow64Process call during fingerprinting, which indicates the developer wrote a single architecture-aware loader rather than maintaining separate 32 and 64-bit builds. That choice scales: one Lua source file targets every Windows architecture the host LuaJIT can run on.

We return to this point when covering SmartLoader’s persistence mechanism, where it helps correlate the architectural decisions with the MaaS targeting scope. Operational tooling is usually also highly versatile, but some of these decisions are not available in hardened environments. Prior enumeration of the target environment can also call for narrower tooling, frequently adjusted to match environment-specific restrictions.

Immediately after FFI setup, ffi.C.GetConsoleWindow() followed by ffi.C.ShowWindow(hwnd, SW_HIDE) hides the console window. From this point on the user sees nothing.

1.4 Working Around LuaJIT FFI’s ABI Limits

LuaJIT FFI cannot directly access CPU segment registers, which is a hard floor in the FFI design since Lua has no concept of segment-relative addressing and the FFI’s type system has no syntax for it. The developer worked around this with a small RWX shellcode stub. VirtualAlloc is called with PAGE_EXECUTE_READWRITE and a 10-byte x64 stub is written into the allocation:

mov rax, gs:[0x60]    ; Read PEB pointer from GS segment
ret                   ; Return PEB* in RAX

The stub is cast to a function pointer via ffi.cast("PEB*(*)()", alloc) and invoked. The pattern executes 6 times across the loader, once for each target DLL module. In effect, it is an exploitation-style primitive (shellcode in an RWX allocation) repurposed for routine malware behavior.
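For analysts, the stub is small enough to hunt for directly in a memory dump. The sketch below assumes the standard assembler encoding of the two instructions above (a GS-prefixed mov with an absolute 32-bit displacement, followed by ret); a build that encodes the load differently would not match, and the buffer used in the demo is synthetic.

```python
# Byte encoding of the two-instruction stub described above (assumed encoding):
#   65 48 8B 04 25 60 00 00 00   mov rax, gs:[0x60]
#   C3                           ret
PEB_STUB = bytes.fromhex("65488b042560000000c3")


def find_peb_stubs(dump: bytes) -> list[int]:
    """Return every offset in `dump` where the 10-byte stub occurs."""
    offsets = []
    pos = dump.find(PEB_STUB)
    while pos != -1:
        offsets.append(pos)
        pos = dump.find(PEB_STUB, pos + 1)
    return offsets


if __name__ == "__main__":
    # Synthetic buffer standing in for a dumped RWX region (hypothetical data).
    sample = b"\x00" * 16 + PEB_STUB + b"\x90" * 8 + PEB_STUB
    print([hex(o) for o in find_peb_stubs(sample)])
```

Because the loader allocates the stub once per target module, several hits in small RWX allocations inside a luajit.exe process would be a reasonable triage signal.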

This is a small but instructive piece of engineering. The developer hit a tooling limitation (a feature their chosen language doesn’t expose) and solved it with the minimum viable bridge between Lua FFI and the one piece of CPU state the FFI cannot reach. Ten bytes of assembly is the entire bridge. The same architectural pattern is used by modern indirect-syscall implementations (drop a small RWX stub, populate registers, return), except here the stub bootstraps into the PEB rather than evading user-mode hooks on ntdll. The pattern is portable and it is one of the cleaner pieces of code in the loader.

Language-specific barriers are common in operational tooling, along with environment-specific restrictions. This usually results in modular tools that can be adapted on the fly during operations, alongside a more complex, mature set of tools kept at the operator’s disposal. Assessing if and when such tools should be used is a big part of modern operations: a lack of caution can trigger defense systems or personnel responses with catastrophic operational consequences, so careful risk management is required, especially in longer campaigns.

1.5 PEB-Walk as a Reusable Module

With the PEB pointer in hand, the VM walks PEB.Ldr.InMemoryOrderModuleList, traverses the export table of each target DLL, hashes export names, and casts each result to a callable function pointer via ffi.cast("FARPROC", addr). DLLs not loaded at process start (wininet.dll, shlwapi.dll, shell32.dll) are loaded at runtime via ntdll!LdrLoadDll, which sidesteps user-mode hooks placed on LoadLibrary*.

The resolution covers:

  • ntdll.dll: RtlInitUnicodeString, LdrLoadDll, RtlCreateUnicodeStringFromAsciiz
  • kernel32.dll: CreateThread, GetComputerNameW, VirtualProtectEx, WinExec, GetModuleHandleA, OpenMutexW
  • advapi32.dll: RegCreateKeyExW, RegSetValueExW, RegOpenKeyExW, RegQueryValueExW, RegEnumValueW, RegCloseKey, OpenProcessToken, GetTokenInformation
  • wininet.dll: InternetOpenW, InternetConnectW, HttpOpenRequestW, HttpSendRequestW, InternetReadFile, InternetOpenUrlW, InternetCloseHandle
  • shlwapi.dll: PathFileExistsW, SetFileAttributesW, CopyFileW
  • user32.dll / gdi32.dll: Screenshot APIs via direct ffi.C.* calls

The result is that luajit.exe’s import table contains zero suspicious entries, GetProcAddress is never called, and the entire resolution chain is invisible to import-table scanners and to ETW image-load events on the malware’s own code path.
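When export names are resolved by hash, the usual analysis step is a reverse lookup table: hash every candidate export name once, then label the constants seen in a trace. The write-up does not identify the hash algorithm the loader uses, so the sketch below substitutes the well-known ROR-13 additive hash purely as a stand-in; only the table-building approach is the point.

```python
def ror13_hash(name: str) -> int:
    """ROR-13 additive hash over an export name (assumed algorithm, not confirmed)."""
    h = 0
    for byte in name.encode("ascii"):
        h = ((h >> 13) | (h << 19)) & 0xFFFFFFFF  # rotate right by 13 bits
        h = (h + byte) & 0xFFFFFFFF
    return h


def build_lookup(export_names: list[str]) -> dict[int, str]:
    """Map hash -> export name so constants seen in a trace can be labelled."""
    return {ror13_hash(n): n for n in export_names}


if __name__ == "__main__":
    # A few names taken from the resolution list above.
    names = ["LdrLoadDll", "CreateThread", "RegSetValueExW", "InternetOpenW"]
    table = build_lookup(names)
    observed = ror13_hash("LdrLoadDll")  # stand-in for a constant pulled from a trace
    print(hex(observed), "->", table.get(observed, "<unknown>"))
```

Swapping in the real hash function once it is recovered from the de-virtualised code leaves the rest of the workflow unchanged.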

PEB walking with hashed exports is a well-known maldev pattern. It appears in every commodity loader tutorial, in the Maldev Academy curriculum, in countless GitHub repos (GetProcAddressR, GetModuleHandleR, GetSyscallStub), and in the runtime libraries shipped with several open-source C2 frameworks. The technique itself is roughly a decade old at this point. Mature 2026 maldev libraries have moved past it, typically combining PEB-walking export resolution with direct syscall number recovery (Hell’s Gate, Halo’s Gate, Tartarus’ Gate, FreshyCalls, and the various derivatives) and indirect syscall stubs that issue syscall instructions from within ntdll’s own address space to defeat user-mode hooks.

SmartLoader does none of that. Every call routes through documented user-mode entrypoints (kernel32!CreateThread, wininet!InternetOpenW, advapi32!RegSetValueExW, etc.), every one of which is a hookable function and every one of which will produce ETW events on hosts with a properly configured EDR. From a development standpoint, the resolution layer is functionally a reimplementation of the standard tutorial PEB walk, translated into Lua FFI syntax. These design decisions make sense for the breadth-over-depth targeting employed in the current scope. Modern operational tooling must be very cautious with these implementations, as they are highly likely to trigger some sort of detection event in hardened environments.

The technique is dated, but applying it inside a borrowed runtime makes the dated parts unobservable through the most common static-analysis tools. The interesting development lesson is that the choice of language sometimes matters more than the choice of technique. How these techniques are bundled and applied together also heavily influences detection patterns, especially when dealing with YARA rules that may not match a specific branch of an implementation.

1.6 Anti-Analysis (Obfuscator-Provided)

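The VM reads PEB.BeingDebugged, PEB.NtGlobalFlag, ProcessHeap.Flags, and ProcessHeap.ForceFlags via the shellcode-acquired pointer. These are the classic debugger-presence indicators, and the checks are largely ineffective: most analysis environments in 2026 clear or patch these fields as a matter of course.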

The mechanism that does work is self-checksumming. The loader checksums its own code sections at runtime, and if any byte has been replaced with 0xCC (INT3), the checksum fails and execution terminates with Tamper Detected!. The check runs during the obfuscator VM’s bootstrap, before any user-defined Lua code has a chance to execute, which means even a single software breakpoint placed by an analyst trips it. Only hardware breakpoints (ba e1) survive, since they use CPU debug registers instead of code patching.
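The reason a software breakpoint trips the check while a hardware breakpoint does not can be shown with a toy model. The obfuscator's actual checksum routine is not identified in this analysis, so CRC32 is used here only as a stand-in, and the code bytes are arbitrary placeholders.

```python
import zlib


def checksum(code: bytes) -> int:
    """CRC32 stands in for the obfuscator's real (unidentified) checksum."""
    return zlib.crc32(code)


def is_tampered(code: bytes, expected: int) -> bool:
    """True when the bytes no longer match the checksum recorded at build time."""
    return checksum(code) != expected


if __name__ == "__main__":
    original = bytes.fromhex("4889e54883ec20")  # arbitrary placeholder bytes
    expected = checksum(original)
    patched = b"\xcc" + original[1:]  # a debugger writes INT3 over the first byte
    print("untouched:", is_tampered(original, expected))
    print("software breakpoint:", is_tampered(patched, expected))
```

A hardware breakpoint leaves the code bytes unchanged, so in this model the checksum still matches and the check passes.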

Worth flagging that this is a feature of the obfuscation toolchain, not of the malware itself. Prometheus, IronBrew, and Luraph all advertise integrity-check options as part of their commercial feature set. The malware author did not write the checksum routine. They enabled a flag in the obfuscator and shipped the result. This is broadly consistent with how the rest of the loader is built: leverage commercial / open-source components where they exist, write only the glue.

Framework reuse, modularity, flexibility, and tool stability are also common concerns in operational tooling development. The commercial and MaaS context is largely irrelevant here, as this kind of component reuse is practically part of the blueprint of every new offensive tooling project.

1.7 Reconnaissance Layer Implementation

The reconnaissance code collects the computer name via GetComputerNameW, screen resolution via GetSystemMetrics, architecture via IsWow64Process, OS version via VerifyVersionInfoW + VerSetConditionMask, admin status via OpenProcessToken + GetTokenInformation(TOKEN_ELEVATION), and geo-IP via an HTTP GET to ip-api.com/json/.

A single-instance mutex (CreateMutexW("xp30pub1tze8uisj")) guards against duplicate execution.

The loader then captures a full desktop screenshot via the GDI pipeline (GetDC(NULL) -> CreateCompatibleDC -> CreateDIBSection -> BitBlt(SRCCOPY)) and constructs a raw BMP with BITMAPFILEHEADER + BITMAPINFOHEADER. The uncompressed BMP is POSTed to http://178.17.59.88/api/<fingerprint> as multipart/form-data.
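To put a number on the size of that upload, the BMP layout can be computed directly: a 14-byte BITMAPFILEHEADER, a 40-byte BITMAPINFOHEADER, and pixel rows padded to 4-byte boundaries. The bit depth used by the loader is not stated, so 32 bits per pixel is an assumption, as are the example resolutions.

```python
def bmp_size(width: int, height: int, bits_per_pixel: int = 32) -> int:
    """Size in bytes of an uncompressed BMP: two headers plus padded pixel rows."""
    file_header = 14  # sizeof(BITMAPFILEHEADER)
    info_header = 40  # sizeof(BITMAPINFOHEADER)
    stride = ((width * bits_per_pixel + 31) // 32) * 4  # rows pad to 4 bytes
    return file_header + info_header + stride * height


if __name__ == "__main__":
    # Example desktop resolutions (hypothetical victims).
    for w, h in [(1366, 768), (1920, 1080)]:
        print(f"{w}x{h}: {bmp_size(w, h) / 1_000_000:.1f} MB")
```

Under these assumptions a single 1920x1080 capture is a little over 8 MB in one HTTP POST.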

The fingerprint is encoded as comma-separated hex byte values, then base64-encoded to form the URL path. There is nothing engineering-noteworthy about any of this; the recon layer is the most boilerplate part of the loader and reads as a developer working through a standard checklist. GDI is the path of least resistance for the screenshot — DirectX or Desktop Duplication API would produce cleaner output but require substantially more binding work in FFI — and the result here is an uncompressed BMP large enough that someone watching network traffic would notice. The developer prioritized “works on first try” over “minimal bandwidth,” which aligns with the targeting scope chosen during development.
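The fingerprint encoding described at the start of this paragraph is easy to reverse when URL paths show up in proxy logs. The sketch below reads "comma-separated hex byte values, then base64" literally: each byte becomes two hex digits, the values are joined with commas, and the result is base64-encoded with the standard alphabet. The field layout of the raw fingerprint and the choice of standard rather than URL-safe base64 are assumptions.

```python
import base64


def encode_fingerprint(raw: str) -> str:
    """raw string -> 'xx,xx,...' hex list -> base64 (assumed standard alphabet)."""
    hex_csv = ",".join(f"{b:02x}" for b in raw.encode("utf-8"))
    return base64.b64encode(hex_csv.encode("ascii")).decode("ascii")


def decode_fingerprint(path_segment: str) -> str:
    """Reverse of encode_fingerprint, for URL path segments seen in logs."""
    hex_csv = base64.b64decode(path_segment).decode("ascii")
    return bytes(int(tok, 16) for tok in hex_csv.split(",")).decode("utf-8")


if __name__ == "__main__":
    # Hypothetical fingerprint contents; the real field layout is not documented here.
    segment = encode_fingerprint("DESKTOP-TEST|1920x1080|x64")
    print(segment)
    print(decode_fingerprint(segment))
```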


Stage 2 — Blockchain C2 Resolution

This is the section of the loader where the development sophistication ramps up sharply. The host-level code in stage 1 is recognizable maldev. Most people who have built a Windows loader have written most of those routines before. The C2 layer is built on a stack that most commodity-loader developers do not have hands-on experience with.

2.1 Blockchain C2 — the Required Dev Stack

After uploading the screenshot, the loader queries the Polygon (MATIC) blockchain to resolve the real stage-2 C2 URL. It calls InternetOpenW with proxy-aware settings, connects to polygon-rpc.com over HTTPS via InternetConnectW, and POSTs a standard eth_call JSON-RPC request:

{
  "jsonrpc": "2.0",
  "method": "eth_call",
  "params": [{
    "to": "0x1823A9a0Ec8e0C25dD957D0841e3D41a4474bAdc",
    "data": "0x3bc5de30"
  }, "latest"],
  "id": 1
}

eth_call is a read-only contract query requiring no wallet, no signing, and no gas fees from the caller. The data field is the function selector for getData(). The response is ABI-encoded: offset, length, then the hex-encoded ASCII string http://89.169.12.241.
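The request and the response layout described above can be reproduced offline. The sketch below builds the same JSON-RPC body and decodes a single ABI-encoded string return value (a 32-byte offset, a 32-byte length, then the data padded to a 32-byte boundary). The demo re-creates the response locally rather than querying a live node, and the function names are illustrative.

```python
import json

CONTRACT = "0x1823A9a0Ec8e0C25dD957D0841e3D41a4474bAdc"
SELECTOR = "0x3bc5de30"  # selector for getData(), as given above


def build_eth_call(contract: str = CONTRACT, data: str = SELECTOR) -> str:
    """Serialise the JSON-RPC request body shown above."""
    return json.dumps({
        "jsonrpc": "2.0",
        "method": "eth_call",
        "params": [{"to": contract, "data": data}, "latest"],
        "id": 1,
    })


def decode_abi_string(result_hex: str) -> str:
    """Decode a single ABI-encoded `string` return value."""
    raw = bytes.fromhex(result_hex.removeprefix("0x"))
    offset = int.from_bytes(raw[0:32], "big")  # where the string record starts
    length = int.from_bytes(raw[offset:offset + 32], "big")
    start = offset + 32
    return raw[start:start + length].decode("ascii")


if __name__ == "__main__":
    # Re-create the response layout locally instead of querying a live node.
    value = b"http://89.169.12.241"
    padded = value + b"\x00" * (-len(value) % 32)
    words = (32).to_bytes(32, "big") + len(value).to_bytes(32, "big") + padded
    print(build_eth_call())
    print(decode_abi_string("0x" + words.hex()))
```

The same decoder can be pointed at the `result` field returned by any Polygon RPC endpoint when tracking rotations of the stored value.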

What is structurally interesting about this architecture is not the blockchain layer in isolation but the underlying interaction model. There is no persistent C2 connection. The loader does not register with a server, does not hold a session, does not wait for inbound commands. It performs a single read against a public ledger, treats the response as configuration, and disconnects. The operator never directly addresses any specific loader instance: they update a value in a public contract, and the next time any deployed implant polls, it picks up the change.

This is a dead drop. The pattern long predates malware: a publisher writes to a public medium, a reader polls that medium for new content, and the two parties never communicate directly. The blockchain implementation is the same pattern with a different storage medium; what changes is the medium’s takedown resistance, not the interaction model.

The interaction model matters because it determines what kind of operations the C2 is good for. A dead drop is asynchronous, unidirectional, and low-bandwidth. The operator can publish configuration; they cannot run an interactive shell, exfiltrate arbitrary data in real time, or pivot from one specific host to another. Anything beyond “here is the next thing to do, applicable to every victim” requires a separate channel.

That set of constraints maps cleanly onto MaaS economics. The commodity tier is built on breadth: many infections, each treated as fungible, no per-victim operator attention. The unit of value is the infection count, not access to any specific host. SmartLoader’s C2 architecture is exactly what that business model produces: broadcast configuration to every implant, run whatever the catalogue currently has staged, replace any individual host trivially. The dead drop fits because the operator does not need to talk to any specific victim.

The contrast with targeted-operation C2 is instructive. Targeted modern implants run on interactive, bidirectional, low-latency channels because the operator’s value comes from access to this particular host’s environment. The C2 must support live tasking, file transfer, lateral movement, session pivoting. None of those primitives map onto a dead drop without painful workarounds. The inverse holds too: nobody runs a 50,000-infection commodity campaign through interactive C2, because the operator-attention cost per host destroys the unit economics.

In a more hardened environment, the blockchain C2 model may also prove unusable, because firewall rules and other controls restrict outbound communication to specific channels. A good example of a channel chosen with this in mind is APT41’s use of Google Calendar as a C2, described in a Google Threat Intelligence blog post. Choices like these, which affect communication opsec, are crucial in modern hardened environments, where stealth and persistence matter heavily.

The blockchain layer is therefore doing two jobs at once. It is the resilience layer, which is what most published analysis focuses on. But it is also the interaction-model layer, and the choice of dead drop is what makes the loader architecturally compatible with MaaS breadth targeting in the first place. A bidirectional C2 with comparable takedown resistance would be a much harder build; SmartLoader’s authors did not have to build it because the business model did not require it.

The economic argument for choosing Polygon over Ethereum mainnet is straightforward: near-zero gas fees (a cent or less per write), 2-second block times, and freely available public RPC endpoints. The contract creator wallet shows 9 setData() updates over 116 days as of the time of analysis.

The fully-loaded cost of running C2 infrastructure resilient to law-enforcement takedown comes out at roughly $0.10 across a four-month campaign. That is not a typo, it is that cheap and easy. Updating the C2 URL costs the operator approximately $0.01 per transaction, and there is no domain to register, no hosting to subpoena, and no single point of failure.
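The arithmetic behind that figure is short enough to write out. The inputs are the numbers quoted in this section (9 updates, about $0.01 each, 116 days); the per-transaction cost is an approximation, so the result is an order-of-magnitude estimate rather than an exact bill.

```python
def campaign_cost(updates: int, cost_per_update_usd: float) -> float:
    """Total on-chain spend for a given number of setData() writes."""
    return updates * cost_per_update_usd


def mean_rotation_interval(days: int, updates: int) -> float:
    """Average number of days between C2 rotations."""
    return days / updates


if __name__ == "__main__":
    print(f"total spend: ${campaign_cost(9, 0.01):.2f}")
    print(f"one update every {mean_rotation_interval(116, 9):.1f} days on average")
```

Nine writes at roughly a cent each come to about nine cents, which rounds to the $0.10 quoted above.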

The specific implementation in this strain is already broken. The hardcoded dependency on polygon-rpc.com means that as of February 2026 the eth_call never completes against the live endpoint, and the resolved stage-2 URL is never returned through normal execution. Reaching the rest of the loader’s logic during analysis therefore required substituting the dead transport. Rather than re-host the RPC or hand-patch every response in a debugger, the eth_call return path was driven through symbolic execution: the JSON-RPC client’s response-handling routines were modelled symbolically with the contract return value as the only constrained input, which produced concrete execution traces through the downstream decoder, the path check on GetModuleFileNameW, and into the task-decryption routine. The patched return value was then re-injected at runtime to reach the same state without further instrumentation.

This has implications for any future strain. The deprecation of polygon-rpc.com by its operator, Ankr (section 2.3), forced every existing deployment offline simultaneously, which means the next build necessarily uses a different RPC endpoint and almost certainly ships with updated VM keys and a new payload format prepared for them. The analysis path documented here is specific to this strain; the symbolic execution scaffold survives, but the patched values do not.

2.2 The Wallet as Build-Time State

The deployer wallet is, in effect, part of the build artifacts. It contains the private key that authorizes C2 rotation; without it, the entire deployed fleet is orphaned. From a development standpoint this is a piece of operational state that has to be managed across the team: backed up, possibly shared, possibly held by a single individual.

It is possible to track the campaign’s deployment timeline from the contract’s transaction history. The earliest writes show the developer testing the contract by setting the data field to 127.0.0.1, followed shortly after by the first instance of a functional, internet-facing C2 address.

The most recent transaction produced the currently active C2 address. The rotation pattern across the 9 transactions is consistent with the GitHub-repo rotation observed in stage 3; both are baseline MaaS hygiene.

2.3 The Hardcoded-RPC Failure Mode

polygon-rpc.com was operated by Ankr as a free public RPC gateway for Polygon. Ankr deprecated unauthenticated access and shut down the endpoint entirely on February 16, 2026. The loader hardcodes the URL with no API key and no fallback, so every call now returns HTTP 403 with tenant disabled. The smart contract itself is still functional, but the RPC transport is dead.

This is a development-process failure worth dwelling on. The contract layer is genuinely resilient at the protocol level. The transport layer used to read the contract is not. A hardcoded RPC URL with no fallback list is a single point of failure that bricks every deployed instance of the loader the moment a free-tier provider changes its terms. The fix is mechanically trivial — point to one of several other free Polygon RPCs, rotate through a list, or use a public RPC aggregator. The fix is operationally non-trivial because it requires pushing an updated build to every existing implant, which is exactly the problem that the resilient C2 layer was supposed to solve.

There is also a forward-looking analytical consequence. Because every currently deployed implant fails at C2 resolution, the campaign is effectively dormant until either a new build is pushed or every infected host runs through enough persistence cycles to give up. If a new build appears, it almost certainly uses a different RPC endpoint, and may also ship with updated VM keys and a new final-payload format and decryption routine. The current strain is not forward-compatible with the next.

2.4 Tasking, Feature Flags, and Environmental Keying

After resolving the stage-2 URL, the loader POSTs the screenshot and fingerprint to http://89.169.12.241/api/<fingerprint> and receives an encrypted JSON response containing two base64-encoded fields:

{
  "loader": "<base64 encrypted config>",
  "tasks": "<base64 encrypted task array>"
}

The loader checks GetModuleFileNameW and only decrypts the response when running from the persistence directory (path contains ODMw). On first run from the original location, the response is cached to %USERPROFILE%\Documents\<MachineGuid>.json but not processed.

This is environmental keying implemented as a runtime check on the executable path. The pattern sits in the same family as domain-joined checks, hostname allowlists, victim-specific RC4 keys derived from the volume serial number, and the various “only decrypt if %COMPUTERNAME% matches” tricks seen across the maldev ecosystem. The implementation is on the simpler end — checking that the path string contains a substring is one strstr away from triviality — but the analytical effect is the same: first-run sandbox execution from a sandbox-controlled path never reaches the decryption routine, so automated pipelines that do not install persistence and re-execute from the installed location never see the second-stage behaviour.
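A small model of the gate makes the sandbox blind spot concrete. It assumes a plain, case-sensitive substring test on the module path, which is how the check is described above; the example paths and the function names are hypothetical.

```python
PERSISTENCE_MARKER = "ODMw"


def should_process_tasks(module_path: str) -> bool:
    """Substring check on the executable path (assumed case-sensitive)."""
    return PERSISTENCE_MARKER in module_path


def action_for(module_path: str) -> str:
    """What the loader does with the C2 response, per the description above."""
    return "decrypt" if should_process_tasks(module_path) else "cache only"


if __name__ == "__main__":
    # Hypothetical paths: a first run from a download folder, then the installed copy.
    for path in (r"C:\Users\victim\Downloads\pkg\luajit.exe",
                 r"C:\Users\victim\AppData\Local\ODMw\ODMw.exe"):
        print(path, "->", action_for(path))
```

An automated pipeline that detonates the sample once from its own working directory only ever sees the first branch.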

With GetModuleFileNameW patched to lie about the current path, the C2 response decrypts cleanly:

{
  "bypass_defender": 0,
  "autorun": 0,
  "persistence": 1,
  "hide": 0,
  "relaunch": { "time": -1, "status": false },
  "tablet": { "text": "An error occurred", "status": false }
}

The configuration model is the most interesting piece of development pattern in the loader. Each field is a server-controlled toggle that can be enabled per-victim without re-shipping the binary:

  • bypass_defender — Defender-evasion path (currently disabled; mechanism unknown).
  • autorun — Registry Run-key persistence in addition to scheduled tasks.
  • persistence — Directory copy and schtasks installation.
  • tablet — Decoy error dialog shown to the user.
  • relaunch — Scheduled relaunch on a configurable interval.
  • hide — Window / process visibility flag.

This is feature flagging, the same SaaS development pattern that web platforms use for staged rollouts and A/B testing — applied to a malware framework. The developer wrote one binary that supports several behaviours, gated each behaviour behind a server-controlled flag, and exposed those flags through what is presumably an operator panel. The architectural consequence is that the binary on disk no longer fully describes the loader’s behaviour; behaviour is a function of the binary plus the server’s current configuration for that victim.
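For triage across many decrypted responses, it helps to reduce each configuration to the list of behaviours it switches on. The sketch below treats a scalar field as enabled when it is non-zero and a nested object as enabled when its status is true; that interpretation is an assumption based on the field descriptions above, not something confirmed from the loader's code.

```python
import json

# The decrypted configuration shown above, reproduced verbatim.
DECRYPTED_CONFIG = """
{
  "bypass_defender": 0,
  "autorun": 0,
  "persistence": 1,
  "hide": 0,
  "relaunch": { "time": -1, "status": false },
  "tablet": { "text": "An error occurred", "status": false }
}
"""


def enabled_behaviours(config: dict) -> list[str]:
    """Names of the flags that are switched on in one configuration."""
    enabled = []
    for name, value in config.items():
        if isinstance(value, dict):
            active = bool(value.get("status"))  # nested flags carry a status field
        else:
            active = bool(value)  # scalar flags are 0 or 1
        if active:
            enabled.append(name)
    return sorted(enabled)


if __name__ == "__main__":
    print(enabled_behaviours(json.loads(DECRYPTED_CONFIG)))
```

Diffing this list between victims or over time shows which behaviours the operator toggles without any change to the files on disk.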

Modular C2 frameworks (Sliver, Havoc, Mythic) all use variants of this pattern. The interesting part here is the application to the loader stage itself rather than to the post-exploitation agent. Most commodity loaders have a fixed behaviour per build; SmartLoader has a per-victim behaviour graph driven by a feature-flag schema.

This also tells us something about SmartLoader: despite the “loader” in its name, this behavior, together with a persistence mechanism that repeatedly reaches out to fetch new payloads, indicates that it is used as a lightweight implant whose only job is to keep delivering whatever malware is in the C2’s catalog each time it checks in.

The presence of bypass_defender as a disabled-but-present field implies an evasion module that exists in the framework but was not enabled for this campaign, either because the campaign’s targets did not warrant burning it, because the feature is gated behind a different payload from the one delivered, or for other unknown reasons.


Stage 3 — Persistence and Payload Delivery

3.1 Persistence — Minimum Viable Implementation

When persistence: 1, the loader checks via PathFileExistsW whether the persistence directory already exists. If any component is missing, it calls SHCreateDirectoryExW to create %LOCALAPPDATA%\ODMw\, then CopyFileW three times to copy the interpreter as ODMw.exe, the runtime DLL, and the Lua payload as ang.txt.

It then creates a scheduled task via WinExec:

schtasks /create /sc daily /st <random_time> /f /tn AmazonCloudDrive_ODMw
  /tr ""ODMw.exe" "ang.txt""

The task name masquerades as Amazon Cloud Drive. The execution time is randomised via math.random. A marker file at %TEMP%\reader.exeODMw prevents reinstallation on subsequent runs.
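Because the task is created by shelling out to schtasks.exe, the command line itself is a usable detection point. The sketch below flags process-creation events whose command line creates a task named AmazonCloudDrive_ plus a suffix and whose action runs an .exe with a .txt argument. The regular expressions are illustrative and derived only from the single command shown above; other builds may use different names.

```python
import re

TASK_NAME = re.compile(r'/tn\s+"?AmazonCloudDrive_\w+', re.IGNORECASE)
EXE_WITH_TXT_ARG = re.compile(r'/tr\s+.*\.exe"?\s+"?[^"\s]+\.txt', re.IGNORECASE)


def is_suspicious_schtasks(cmdline: str) -> bool:
    """True when a schtasks /create command matches the pattern described above."""
    lowered = cmdline.lower()
    if "schtasks" not in lowered or "/create" not in lowered:
        return False
    return bool(TASK_NAME.search(cmdline) and EXE_WITH_TXT_ARG.search(cmdline))


if __name__ == "__main__":
    # The start times and the benign command are hypothetical.
    observed = 'schtasks /create /sc daily /st 13:37 /f /tn AmazonCloudDrive_ODMw /tr ""ODMw.exe" "ang.txt""'
    benign = 'schtasks /create /sc daily /st 09:00 /tn Backup /tr "C:\\tools\\backup.exe"'
    print(is_suspicious_schtasks(observed), is_suspicious_schtasks(benign))
```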

All six registry APIs (RegCreateKeyExW, RegSetValueExW, RegOpenKeyExW, RegQueryValueExW, RegEnumValueW, RegCloseKey) are resolved and ready. The server can enable registry Run-key persistence per-victim by setting autorun: 1.

In development-effort terms this is the cheapest possible persistence module: shell out to schtasks.exe via WinExec, log the path, return. There is no use of the ITaskService COM interface (which would avoid spawning schtasks.exe and produce no command-line telemetry), no WMI event subscription (which evades scheduled-task enumeration entirely), no COM hijack, no BITS job, no AppInit DLL — none of the more exotic registry-resident loaders.

The contrast with the stage 2 C2 layer is informative. The persistence module is what gets written when a developer is optimising for time-to-ship over evasion quality. The C2 module is what gets written when a developer is optimising for survival against takedown. Those are different priorities, and the development investment was allocated accordingly.

3.2 Payload Delivery via GitHub Raw

For each entry in the decrypted tasks array, the loader checks a completion marker (PathFileExistsW on %TEMP%\a.luaODMw), downloads the payload via InternetOpenUrlW from a rotating GitHub raw URL, writes it to %TEMP%\a.lua via io.open, and executes it via WinExec("ODMw.exe a.lua", SW_HIDE). A completion marker is then created and the bot reports back to http://89.169.12.241/task/<fingerprint> with the encrypted task ID.

Two GitHub repositories were observed during analysis:

  • Shonieeee/codebytes/1024.txt
  • rianahmedrony/pedbytes/3.txt

At the time of writing, the Shonieeee account had been deleted or banned from GitHub. The repository naming pattern (code, ped) and the file naming pattern (bytes/N.txt) are consistent across both accounts and almost certainly extend to others not yet observed. Evidence gathered before the account was deactivated showed an identical layout in both repositories: a .txt file containing the encoded payload and a .log file used for telemetry or staging.

GitHub raw URLs are the cheapest CDN a developer can wire into a loader and require zero infrastructure on the developer’s side. The development workflow is plausibly:

build_payload -> encode -> push_to_repo -> update_C2_task_URL_via_eth_call

Each step is a few seconds of work. The repository naturally rotates as accounts get reported and banned; the C2 task list updates through one of the cheap setData() calls visible in the contract’s transaction history. The rotation cadence visible across the 9 contract updates over 116 days is consistent with this lifecycle — constant infrastructure churn.

3.3 Framework Architecture and the Unshipped Features

The task entries contain two fields worth examining even though they are not currently used:

  • dll_loader: with type: "LoadLibrary", indicating the framework supports DLL-format payloads loaded via LdrLoadDll or other alternative techniques.
  • pump: a binary-padding field intended to inflate the size of downloaded payloads to evade size-based AV heuristics.

Both are currently disabled. Their presence in the task schema is consistent with the rest of the framework’s design: features are scaffolded into the schema first, implemented incrementally, and gated behind config flags so that disabled features cost nothing at runtime. This is recognizable as ordinary product-development practice: design the configuration surface first, then ship modules against it over multiple iterations.

The 32- and 64-bit PE structures present in the ffi.cdef block from stage 1, combined with IsWow64Process architecture detection, VirtualProtectEx for memory permission changes, CreateThread with LPTHREAD_START_ROUTINE, and the explicit dll_loader configuration in the task schema, all point in the same direction: the framework is designed as a multi-format payload runner, not as a Lua-only executor.

The current stage-3 payload is hex-encoded text. What sits beneath the hex encoding is almost certainly Lua source code encrypted with the same VM cipher used for the string table in the parent script: the byte distribution of the decoded payload rules out a raw PE, which would use the full byte range. However, the decrypted Lua source could embed a PE that it drops or maps into memory at runtime, which is consistent with the framework’s stated multi-payload capability.
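The byte-distribution argument can be reproduced with a few lines of analysis code. This is a minimal sketch, assuming the payload is plain hexadecimal text with optional whitespace; the function name triage_hex_payload and the choice of indicators (MZ magic, count of distinct byte values, printable ratio) are illustrative and not taken from the original analysis. It reports what the decoded bytes look like and leaves the interpretation to the analyst.

```python
import binascii
import string

# Byte values that count as printable ASCII (includes tab, CR, LF).
PRINTABLE = set(string.printable.encode("ascii"))


def triage_hex_payload(hex_text):
    """Hex-decode a payload and summarise what the decoded bytes look like."""
    # Drop whitespace and newlines that text transports tend to add.
    compact = "".join(hex_text.split())
    raw = binascii.unhexlify(compact)  # raises binascii.Error on non-hex input
    printable = sum(1 for b in raw if b in PRINTABLE)
    return {
        "size": len(raw),
        "starts_with_mz": raw[:2] == b"MZ",      # DOS header magic of a PE
        "distinct_byte_values": len(set(raw)),   # a PE tends towards all 256
        "printable_ratio": printable / len(raw) if raw else 0.0,
    }
```

A decoded buffer that lacks the MZ magic and uses only a narrow slice of the byte range is consistent with text or a text-oriented cipher output rather than a raw PE, which is the inference drawn above.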

Recent attribution work has tied SmartLoader deployments to LummaStealer delivery. On that assumption, the .txt payload fetched from GitHub is most likely a packaged version of the LummaStealer reflective loader, encoded into whatever stager format the SmartLoader interpreter expects. The daily scheduled task means each infected host fetches and runs whatever the operator currently has staged. The host is, in effect, a pull-based dropper substrate for the rest of the MaaS catalogue. Architecturally, the loader is a thin runtime for someone else’s payloads.


Closing Thoughts: SmartLoader and the Modern Tooling Landscape

SmartLoader is a useful case study because the development investment across its layers is so visibly uneven, and the unevenness is legible once you know what the loader is for. It is not a sophisticated implant trying to survive on a hardened corporate network. It is a commodity dropper that needs to keep working at scale against consumer-grade systems while the C2 backend serves rotating payloads from whatever the MaaS catalogue currently has staged. Every architectural decision in the loader reads as a direct expression of that requirement: effort is concentrated where the MaaS business model actually exposes risk, and spared everywhere else.

The contrast with modern operational tooling is where this gets interesting. Red-team and APT-grade implants are built under a fundamentally different set of constraints. The target environment is hardened, the EDR is configured by someone who has read the ETW documentation, the firewall enforces egress restrictions that rule out direct blockchain RPC calls entirely, and the operator is being paid for access to a specific environment rather than for raw infection counts. Every technique SmartLoader gets away with by virtue of its targeting scope (documented user-mode entrypoints, scheduled tasks spawned through schtasks.exe, plaintext HTTP exfiltration, and unobfuscated GDI screenshot capture) would be operational wreckage on a serious engagement. The modern operational tooling landscape is therefore evolving in the opposite direction: indirect syscalls, COM-based persistence that never spawns a child process, custom protocol tunneling over channels that survive egress filtering, and payload formats designed to defeat the specific EDR vendors known to be present on the target. The two trees of development are growing apart on most fronts, even as they share the underlying maldev primitives. One observation that keeps being confirmed, and that SmartLoader confirms again, is that old techniques may be outdated in isolation, yet combining several of them under specific circumstances, in specific languages, and so on makes them functionally relevant again.

What SmartLoader demonstrates quite starkly is not only how far modern tooling has evolved, but how much further operational-grade tooling has had to go to counter contemporary EDRs and other defensive products. Not everything that works within a MaaS targeting scope will work elsewhere, and the same is true of operational techniques. It always comes down to finding the technique that works against your current target.


IOCs

Network

  • 178.17.59.88 — Stage-1 C2 (screenshot upload, plaintext HTTP)
  • 89.169.12.241 — Stage-2 C2 (tasking, plaintext HTTP, resolved from blockchain)
  • 89.169.12.160 — Stage-2 C2 (previous, replaced)
  • 93.123.39.74, 84.21.189.135 — Earlier stage-2 C2 addresses (observed in transaction history)
  • polygon-rpc.com — Blockchain RPC endpoint (legitimate Ankr service, abused; deprecated for unauthenticated access Feb 2026)
  • ip-api.com — Geo-IP service (legitimate, abused)
  • User-Agent: Chrome/140.0.0.0

Blockchain

  • Contract: 0x1823A9a0Ec8e0C25dD957D0841e3D41a4474bAdc (Polygon)
  • Function: getData() (selector 0x3bc5de30)
  • Creator wallet: 0xdE275aD38C3352A7cb6b0d3efcBF45900c9716f2
  • Funder wallet: 0xF9Bd8BAD...8362529eA
  • Attribution pivot: 888 @vipTron888_bot Telegram tokens held by creator wallet
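For defenders who want to track changes to the resolver, the contract address and selector above are enough to form a read-only query. This is a minimal sketch that only builds the JSON-RPC request body and sends nothing; the helper name build_get_data_call is hypothetical, and the choice of RPC endpoint and the decoding of the returned value are left out because the return encoding is not restated here.

```python
import json

# Values taken from the IOC list above.
CONTRACT = "0x1823A9a0Ec8e0C25dD957D0841e3D41a4474bAdc"
GET_DATA_SELECTOR = "0x3bc5de30"  # getData(), no arguments


def build_get_data_call(request_id=1):
    """Build a JSON-RPC eth_call body that invokes getData() on the contract."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_call",
        # eth_call is a read-only simulation: no transaction, no gas spent.
        "params": [{"to": CONTRACT, "data": GET_DATA_SELECTOR}, "latest"],
    })
```

Posting this body to any Polygon JSON-RPC endpoint the analyst is authorised to use returns the currently stored value, which can then be compared against the C2 addresses listed under Network.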

GitHub Repositories

  • Shonieeee/codebytes/1024.txt (account suspended at time of writing)
  • rianahmedrony/pedbytes/3.txt, fo/3.txt, fo/55.txt

Host

  • Persistence directory: %LOCALAPPDATA%\ODMw\
  • Persistence binary: ODMw.exe (renamed luajit.exe)
  • Persistence payload: ang.txt (renamed lang.txt)
  • Scheduled task: AmazonCloudDrive_ODMw (daily, randomised time)
  • Mutex: xp30pub1tze8uisj
  • Stage-3 payload: %TEMP%\a.lua
  • C2 cache: %USERPROFILE%\Documents\<MachineGuid>.json
  • Markers: %TEMP%\a.luaODMw, %TEMP%\reader.exeODMw
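The fixed host paths above lend themselves to a quick sweep. This is a minimal sketch, assuming the standard LOCALAPPDATA and TEMP environment variables are set for the user being checked; the names HOST_IOCS and sweep_host are illustrative. The C2 cache file is omitted because its name depends on the machine's MachineGuid.

```python
import os
from pathlib import Path

# Host artifacts from the IOC list, expressed relative to environment roots.
HOST_IOCS = [
    ("LOCALAPPDATA", "ODMw"),
    ("LOCALAPPDATA", "ODMw/ODMw.exe"),
    ("LOCALAPPDATA", "ODMw/ang.txt"),
    ("TEMP", "a.lua"),
    ("TEMP", "a.luaODMw"),
    ("TEMP", "reader.exeODMw"),
]


def sweep_host(env=None):
    """Return the IOC paths that exist, resolving roots from `env`."""
    env = os.environ if env is None else env
    found = []
    for root_var, relative in HOST_IOCS:
        root = env.get(root_var)
        if not root:
            continue  # variable not set, e.g. on a non-Windows host
        candidate = Path(root) / relative
        if candidate.exists():
            found.append(str(candidate))
    return found
```

Calling sweep_host() with no arguments checks the current user's profile; passing a dictionary of roots allows the same check against a mounted disk image or another user's directories.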
