1) Set up big data infrastructure, designed efficient workflows, led software development, and helped enterprises design solutions to quickly migrate data from a variety of legacy systems to the big data platform
2) Led the design and development of key components for Data Fabric projects, such as the Data Migrator and DLM
3) Used technologies such as Spark and Scala to build high-performance, robust big data jobs
4) Decoupled the big data layer from the business logic and made the code more concise, so developers could focus on business logic and deliver complex projects efficiently in a short time
5) Designed robust and efficient tools to migrate data across different environments such as data lakes, MongoDB, S3, Cassandra, and RDBMS
6) Established strategies to improve data processing efficiency, such as simplifying execution plans and reducing algorithmic (Big O) complexity, to produce robust, reliable, high-performance big data jobs
7) Designed the Data Lifecycle Management Service, a key component of the data fabric that is robust and easy to maintain and extend
8) Successfully applied data security in the Hadoop system, encrypting and decrypting data at the application level with Voltage
9) Reviewed, designed, and successfully delivered the MUREX-DARWIN project for Counterparty Credit Risk using Spark, Scala, and Hadoop
10) Optimized Spark jobs and simplified the workflow, making the jobs more concise, efficient, and robust on the shared Hadoop cluster
11) Led developers in writing concise Spark code that is easier to maintain and extend in future development
12) Used Scala, Spark, Flink, and the Kafka Streams Processor API to design the FRTB MurexEOD and Intraday project for Capital Markets, and led the team through end-to-end development and delivery
13) Led the design and implementation of the Position Materiality and Exception Report for Capital Markets in Spark, processing 970M records in around 6 minutes on a Hadoop YARN cluster
14) Developed a query engine as a Spark job that decouples business logic from the underlying technology, so BAs can directly create generic queries that execute in the Spark context with high performance
15) Helped the team onboard Spark on Kubernetes (k8s), with Spark jobs processing data on k8s and reading from and writing to AWS and Azure ADLS Gen2