Research & Publications

Complete overview of research projects and academic publications

Research Projects

Blockchain Scalability

Following the success of the cryptocurrencies, blockchain has recently evolved into a technology platform that powers secure, decentralized, and consistent transaction ledgers at Internet-scale. However, the performance bottleneck remains one of the most critical challenges of current blockchains. In the standard Nakamoto consensus, the performance is bottlenecked by the facts 1) that only one participant can win the competition and contribute to the blockchain, i.e., concurrent blocks are discarded as forks, and 2) that the slowness is essential to defend against adversaries. For example, Bitcoin generates one 1MB block every 10 minutes and can therefore only process 7 transactions per second. The insufficient throughput and long confirmation delay severely limit the adoptions of blockchain techniques, causing poor user experience, congested network, and skyrocketing transaction fees.

Conflux

Conflux is a new fast, scalable, and decentralized blockchain system that can process thousands of transactions per second while confirming each transaction in minutes. Conflux records additional informations between blocks and organizes generated blocks into direct acyclic graphs. The core of Conflux is its consensus protocol that allows multiple participants to contribute to the Conflux blockchain concurrently (i.e., processing transactions in all concurrent blocks) while still being provably safe. See the Conflux project website for more information.

Shrec

Shrec is a novel transaction relay protocol for high-throughput blockchain systems built around a hybrid transaction hashing scheme that has a low hash collision rate, is resilient to collision attacks, and is fast to construct. Shrec reduces the bandwidth consumption by 60% at modest CPU overhead and improves the system throughput by up to 90%.

Gosig

Gosig is a scalable Byzantine consensus protocol for consortium blockchains. Gosig uses transmission pipelining to fully utilize the network bandwidth and uses aggregated signature gossip to reduce the number of messages. It can scale up to 5000 nodes while still maintaining the throughput of 3000 transactions per second.

Smart Contract Security

Smart contract is a program that encodes a set of transaction rules. Once deployed to a blockchain, its encoded rules are enforced by all participants of the blockchain network, and therefore it eliminates counterparty risks in sophisticated transactions. Unfortunately, like other programs, smart contracts may contain errors. Errors inside smart contracts often lead to significant financial losses in the real world.

Solythesis

Solythesis is a source-to-source runtime validation tool for Solidity. Its design is based on the observation that smart contract execution is not the performance bottleneck of blockchain systems. The overhead of runtime validation, which is often too expensive for other domains, is in fact negligible for smart contracts.

Automatic Patch Generation

Software defects are pervasive in software systems and can cause undesirable user experience, denial of service, or even security exploitation. Generating a patch for a defect is a tedious, time-consuming, and often repetitive process. Automatic patch generation techniques holds out the promise of automatically correcting software defects without the need for human developers to diagnose, understand, and correct these defects. To learn more, please visit our project website!

Prophet

Prophet is the state-of-art generate-and-validate patch generation system for C programs. It is the first system that uses machine learning techniques to learn from past successful human patches to recognize and predict correct patches for new errors.

SPR

SPR is the baseline system on which Prophet is built. It uses the condition synthesis technique to explore its search space up to two magnitude faster.

CodePhage

CodePhage systematically transfers useful security checks from a donor application to eliminate bugs and security vulnerabilities in a recipient application. It is the first system that transfers useful code across applications. It does not even require the source code of the donor application!

Input Filtering and Rectification

What if we cannot change the source code of an application? Let's look at the inputs of the application. We can make sure that malicious input cannot reach the application, i.e., filter them or rectify them.

SIFT

SIFT is a sound input filter system with sophisticated program analysis techniques. It guarantees to filter out all malicious inputs that trigger critical integer overflow errors. In practice, it also has zero to negligible false positives.

SOAP

SOAP is the first automatic input rectification system. It enforces a set of inferred invariants on the inputs so that potentially malicious inputs are transformed to benign inputs.

Program Recovery

What if an application crashes during its execution and we only have its binary? We can use our recovery shepherding technique to enable the application to survive the error triggering input unit and recovers its execution.

RCV

RCV is a lightweight program recovery tool with negligible overhead during normal execution. When a crash error (null-dereference and/or divide-by-zero) occurs, it systematically guides the application execution to survive the error triggering input unit. It also tracks how the error propagates in the application and waits until the error is flushed away after the program moves to the next input unit. Instead of crash and getting nothing, you can get part or all of your desired results.

Full Publication List

TypyBench: Evaluating LLM Type Inference for Untyped Python Repositories

Honghua Dong, Jiacheng Yang, Xun Deng, Yuhe Jiang, Gennady Pekhimenko, Fan Long, Xujie Si

ICML 2025

[PDF]

Demystifying Invariant Effectiveness for Securing Smart Contracts

Zhiyang Chen, Ye Liu, Sidi Mohamed Beillahi, Yi Li, and Fan Long

FSE 2024

[PDF]

Safeguarding Defi Smart Contracts Against Oracle Deviations

Xun Deng, Sidi Mohamed Beillahi, Cyrus Minwalla, Han Du, Andreas Veneris, and Fan Long

ICSE 2024 (Distinguished Paper)

[PDF]

Flashsyn: Flash Loan Attack Synthesis via Counter Example Driven Approximation

Zhiyang Chen, Sidi Mohamed Beillahi, and Fan Long

ICSE 2024

[PDF]

LVMT: An Efficient Authenticated Storage for Blockchain

Chenxing Li, Sidi Mohamed Beillahi, Guang Yang, Ming Wu, Wei Xu, and Fan Long

USENIX OSDI 2023

[PDF]

A Robust Front-running Methodology for Malicious Flash-Loan Defi Attacks

Xun Deng, Zihan Zhao, Sidi Mohamed Beillahi, Han Du, Cyrus Minwalla, Keerthi Nelaturu, Andreas Veneris, and Fan Long

IEEE DAPPS 2023

[PDF]

Möbius: an Atomic State Sharding Design for Account-based Blockchains

Srisht Fateh Singh, Panagiotis Michalopoulos, Sidi Mohamed Beillahi, Andreas Veneris, and Fan Long

IEEE ICBC 2023

[PDF]

Mercury: Fast Transaction Broadcast in High Performance Blockchain Systems

Mingxun Zhou, Liyi Zeng, Yilin Han, Peilun Li, Fan Long, Dong Zhou, Ivan Beschastnikh, and Ming Wu

IEEE INFOCOM 2023

[PDF]

SigVM: Enabling Event-Driven Execution for Autonomous Smart Contracts

Zihan Zhao, Sidi Mohamed Beillahi, Ryan Song, Yuxi Cai, Andreas Veneris, and Fan Long

SPLASH/OOPSLA 2022

[PDF]

Utilizing Parallelism in Smart Contracts on Decentralized Blockchains by Taming Application-Inherent Conflicts

Peter Garamvolgyi, Yuxi Liu, Dong Zhou, Fan Long, Ming Wu

ICSE 2022

[PDF]

Automatic Horizontal Fusion for GPU Kernels

Ao Li, Bojian Zhang, Gennady Pekhimenko, Fan Long

CGO 2022

[PDF]

LMPTs: Eliminating Storage Bottlenecks for Processing Blockchain Transactions

Jemin Andrew Choi, Sidi Mohamed Beillahi, Peilun Li, Andreas Veneris, Fan Long

IEEE ICBC 2022 (Best Paper)

[PDF]

Automated Auditing of Price Gouging TOD Vulnerabilities in Smart Contracts

Sidi Mohamed Beillahi, Eric Keilty, Keerthi Nelaturu, Andreas Veneris, Fan Long

IEEE ICBC 2022

[PDF]

PipeZK: Accelerating Zero-Knowledge Proof with a Pipelined Architecture

Ye Zhang, Shuo Wang, Xian Zhang, Jiangbin Dong, Xingzhong Mao, Fan Long, Cong Wang, Dong Zhou, Mingyu Guo, Guangyu Sun

ISCA 2021

[PDF]

Smart Contract Refinement for Gas Optimization

Keerthi Nelaturu, Sidi Mohamed Beillahi, Fan Long, Andreas Veneris

IEEE BRAIN 2021

[PDF]

Shrec: Bandwidth-Efficient Transaction Relay in High-Throughput Blockchain Systems

Yilin Han, Chenxing Li, Peilun Li, Ming Wu, Dong Zhou, and Fan Long

SoCC 2020

[PDF]

Gosig: A Scalable and High-Performance Byzantine Consensus for Consortium Blockchains

Peilun Li, Guosai Wang, Xiaoqi Chen, Fan Long, and Wei Xu

SoCC 2020

[PDF]

Engineering Economics in the Conflux Network

Yuxi Cai, Fan Long, Andreas Park, and Andreas Veneris

IEEE BRAIN 2020

[PDF]

A Decentralized Blockchain with High Throughput and Fast Confirmation

Chenxing Li, Peilun Li, Dong Zhou, Zhe Yang, Ming Wu, Guang Yang, Wei Xu, Fan Long, and Andrew Chi-Chih Yao

USENIX ATC 2020

[PDF]

Securing Smart Contract with Runtime Validation

Ao Li, Jemin Andrew Choi, and Fan Long

PLDI 2020

[PDF]

Automatic Patch Generation via Learning from Successful Human Patches

Fan Long

PhD Thesis

[PDF]

Automatic Inference of Code Transforms for Patch Generation

Fan Long, Peter Amidon, and Martin Rinard

FSE 2017

[PDF] [Slides] [Artifact]

CodeCarbonCopy

Stelios Sidiroglou-Douskos, Eric Lahtinen, Anthony Eden, Fan Long, and Martin Rinard

FSE 2017

[PDF]

An Analysis of the Search Spaces for Generate and Validate Patch Generation Systems

Fan Long and Martin Rinard

ICSE 2016

[PDF] [Slides] [Artifact]

Automatic Patch Generation by Learning Correct Code

Fan Long and Martin Rinard

POPL 2016

[PDF] [Slides] [Artifact]

Control Jujutsu: On the Weaknesses of Fine-Grained Control Flow Integrity

Isaac Evans, Fan Long, Ulziibayar Otgonbaatar, Howard Shrobe, Martin Rinard, Hamed Okhravi, and Stelios Sidiroglou-Douskos

CCS 2015

[PDF] [Slides]

Staged Program Repair with Condition Synthesis

Fan Long and Martin Rinard

ESEC-FSE 2015

[PDF] [Slides] [Artifact]

An Analysis of Patch Plausibility and Correctness for Generate-And-Validate Patch Generation Systems

Zichao Qi, Fan Long, Sara Achour, and Martin Rinard

ISSTA 2015

[PDF] [Artifact]

Automatic Error Elimination by Multi-Application Code Transfer

Stelios Sidiroglou, Eric Lahtinen, Fan Long, and Martin Rinard

PLDI 2015

[PDF]

Automatic Integer Overflow Discovery Using Goal-Directed Conditional Branch Enforcement

Stelios Sidiroglou, Eric Lahtinen, Nathan Rittenhouse, Paolo Piselli, Fan Long, Doekhwan Kim, and Martin Rinard

ASPLOS 2015

[PDF]

Principled Sampling for Anomaly Detection

Brendan Juba, Christopher Musco, Fan Long, Stelios Sidiroglou, and Martin Rinard

NDSS 2015

[PDF]

Automatic Runtime Error Repair and Containment via Recovery Shepherding

Fan Long, Stelios Sidiroglou, and Martin Rinard

PLDI 2014

[PDF] [Slides]

Sound Input Filter Generation for Integer Overflow Errors

Fan Long, Stelios Sidiroglou, Deokhwan Kim, and Martin Rinard

POPL 2014

[PDF] [Slides]

From Natural Language Specifications to Program Input Parsers

Tao Lei, Fan Long, Regina Barzilay, and Martin Rinard

ACL 2013

[PDF]

Automatic Input Rectification

Fan Long, Vijay Ganesh, Michael Carbin, Stelios Sidiroglou, and Martin Rinard

ICSE 2012

[PDF]

G2: A Graph Processing System for Diagnosing Distributed Systems

Zhenyu Guo, Dong Zhou, Haoxiang Lin, Mao Yang, Fan Long, Chaoqiang Deng, Changshu Liu, and Lidong Zhou

USENIX ATC 2011

[PDF]

Language-based Replay via Data Flow Cut

Ming Wu, Fan Long, Xi Wang, Zhilei Xu, Haoxiang Lin, Xuezheng Liu, Zhenyu Guo, Huayang Guo, Lidong Zhou, and Zheng Zhang

FSE 2010

[PDF] [Slides]

API Hyperlinking via Structural Overlap

Fan Long, Xi Wang, and Yang Cai

ESEC-FSE 2009

[PDF] [Slides]

MODIST: Transparent Model Checking of Unmodified Distributed Systems

Junfeng Yang, Tisheng Chen, Ming Wu, Zhilei Xu, Xuezheng Liu, Haoxiang Lin, Mao Yang, Fan Long, Lintao Zhang, and Lidong Zhou

NSDI 2009

[PDF]