Introducing Outflux: a smart way out of InfluxDB
Reposted from: https://blog.timescale.com/migrate-outflux-a-smart-way-out-of-influxdb/
Migrate your workload from InfluxDB to TimescaleDB with just a single command
Users often ask us how to migrate off InfluxDB because they want the reliability and flexibility of TimescaleDB. To make it easier for them, we are introducing a new migration tool called Outflux.
Designed to help users seamlessly migrate from InfluxDB to TimescaleDB, Outflux is built in a modular fashion, enabling users to either migrate schema or data or both directly into TimescaleDB. It’s easy to use, configurable, and most importantly, it’s fast.
In this post, we will first cover the motivations behind creating Outflux, then dive deeper into how we built it, how it works, and how to get started.
[Or jump straight to the tutorial.]
Why we created Outflux
We give props to InfluxData for being an early entry in the time-series database market, but as with any technology, one size does not fit all. Aside from receiving feedback from our users, we also saw a visible market gap between the services InfluxDB provides and what users actually need (which led to the birth of TimescaleDB).
Early on, we recognized that working with IoT data was a whole different ballgame from the DevOps workloads InfluxDB is designed for. The ability to combine your time-series data with your relational data (and often, geo-spatial data) is becoming increasingly important for organizations. TimescaleDB is optimized for these types of workloads and gives organizations the option to scale from the cloud to the edge.
We’d also be remiss not to touch on the fact that TimescaleDB is built on the SQL open standard. This allows organizations to unify under a single query language that is well known among developers across the globe. InfluxData initially decided to create their own SQL-like query language called InfluxQL, but it had quirks and gotchas that tripped up developers. Then they created a new language, Flux, to solve the issues associated with InfluxQL. Now their users are faced with learning a whole new language, which creates a split ecosystem and even more confusion. Not ideal. (More on SQL vs. Flux here.)
We’ve benchmarked TimescaleDB against InfluxDB and encourage you to review the results. You might just find that InfluxDB is satisfactory for your needs, or you might decide that it’s time for an upgrade and switch to TimescaleDB.
How we built Outflux & how it works
After some careful evaluation and research, we discovered that several API clients exist for InfluxDB for a variety of programming languages including Python and Go. The Go client is maintained by InfluxData themselves, and using this API client allowed us to tailor the tool to our needs.
We found that we could control:
- The amount of data being selected from the InfluxDB server
- Which data (configurably) is exported
- The input format for TimescaleDB (and additional transformations before inserting)
- The concurrency level
From here, we built Outflux as a series of libraries connected in one CLI that selects data from InfluxDB using its HTTP API, discovers and transforms the data schema, and imports it into TimescaleDB concurrently.
Under the covers, Outflux implements an Extraction Pipeline that has three stages:
- Extractor: where Outflux queries the input database (InfluxDB) by selecting data specified by the user and converting it into an intermediate format
- Transformation Chain: where configurable changes can be made to the data (e.g. castings, filters, column generation) while preserving the intermediate format
- Injector: where the specific code for the receiving database resides (Outflux knows how to transform the intermediate/neutral format and insert it)
Each component works independently of the others in its own coroutine/thread. Multiple pipelines can be used to export each measurement in a concurrent fashion. The Extraction Pipelines are spawned, joined, and managed by the main Outflux component. The figure below illustrates the data flow once the Extraction Pipeline is in place.
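To make the three stages and their concurrency more concrete, here is a minimal Go sketch of the idea. It is an illustration only, not Outflux's actual code: the `Row` type, the `Transformer` signature, the in-memory `Sink` (standing in for TimescaleDB), and every function name are hypothetical. Each stage runs in its own goroutine and hands rows to the next stage over a channel, and one pipeline is started per measurement.

```go
package main

import (
	"fmt"
	"sync"
)

// Row is a hypothetical intermediate format: one point of a measurement.
type Row struct {
	Measurement string
	Fields      map[string]float64
}

// Transformer is one link in the transformation chain.
// Returning false drops the row (a filter).
type Transformer func(Row) (Row, bool)

// Sink is an in-memory stand-in for the receiving database.
type Sink struct {
	mu   sync.Mutex
	rows map[string][]Row
}

func (s *Sink) insert(r Row) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.rows[r.Measurement] = append(s.rows[r.Measurement], r)
}

// extract plays the Extractor: it emits source rows from its own goroutine.
func extract(src []Row) <-chan Row {
	out := make(chan Row)
	go func() {
		defer close(out)
		for _, r := range src {
			out <- r
		}
	}()
	return out
}

// transform applies the chain to each row, keeping the intermediate format.
func transform(in <-chan Row, chain []Transformer) <-chan Row {
	out := make(chan Row)
	go func() {
		defer close(out)
		for r := range in {
			keep := true
			for _, t := range chain {
				if r, keep = t(r); !keep {
					break
				}
			}
			if keep {
				out <- r
			}
		}
	}()
	return out
}

// inject plays the Injector: it drains rows into the sink, then signals done.
func inject(in <-chan Row, sink *Sink, wg *sync.WaitGroup) {
	go func() {
		defer wg.Done()
		for r := range in {
			sink.insert(r)
		}
	}()
}

// migrate spawns one pipeline per measurement and joins them all.
func migrate(data map[string][]Row, chain []Transformer) *Sink {
	sink := &Sink{rows: make(map[string][]Row)}
	var wg sync.WaitGroup
	for _, rows := range data {
		wg.Add(1)
		inject(transform(extract(rows), chain), sink, &wg)
	}
	wg.Wait()
	return sink
}

func main() {
	data := map[string][]Row{
		"cpu": {
			{"cpu", map[string]float64{"usage": 0.5}},
			{"cpu", map[string]float64{"usage": -1.0}},
		},
		"mem": {{"mem", map[string]float64{"used": 2048}}},
	}
	// Example filter: drop rows whose "usage" field is negative.
	dropNegative := func(r Row) (Row, bool) {
		if v, ok := r.Fields["usage"]; ok && v < 0 {
			return r, false
		}
		return r, true
	}
	sink := migrate(data, []Transformer{dropNegative})
	fmt.Println(len(sink.rows["cpu"]), len(sink.rows["mem"])) // prints "1 1"
}
```

Because the stages only communicate through channels carrying the neutral `Row` format, a different extractor or injector could be swapped in without touching the transformation chain, which is the kind of modularity the pipeline design aims for.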
Getting started
Now that you know more about Outflux’s internal framework, you are ready to get started! Outflux is an open-source tool and the code is available on GitHub in a public repository.
Using the tool is easy and involves just a single command: `migrate`. Outflux manages the schema discovery, validation, and creation. It also handles exporting and importing data from an input database to an output database.
Before you begin setting up Outflux you need to ensure you have 1) a running instance of InfluxDB at a known location and a means to connect to it and 2) TimescaleDB installed and a means to connect to it.
If all the prerequisites are met, you can begin installing Outflux by:
- Visiting the releases section of the repository
- Downloading the latest compressed tarball for your platform
- Extracting it to a preferred location
If you navigate to where you extracted the archive and execute:
    $ ./outflux --help
    Outflux offers the capabilities to migrate an InfluxDB database, or specific measurements to TimescaleDB. It can also allow a user to transfer only the schema of a database or measurement to TimescaleDB

    Usage:
      outflux [command]

    Available Commands:
      help            Help about any command
      migrate         Migrate the schema and data from InfluxDB measurements into TimescaleDB hypertables
      schema-transfer Discover the schema of measurements and validate or prepare a TimescaleDB hyper-table with the discovered schema
You will see the help output for Outflux, a brief explanation of what it can do, the usage, and available commands.
For instructions on how to set up Outflux from source, you can visit the README. For step-by-step instructions on how to get started, please read our tutorial.
Note: Outflux currently only supports bulk migrations. Live, continuous migrations from InfluxDB will be supported with other upcoming solutions.
Next steps
Are you ready to migrate off of InfluxDB and upgrade to TimescaleDB? Yes? We thought so! As mentioned above, you can follow the tutorial for in-depth instructions. Be sure to check out the Outflux page for more information and ways to contact us.
If you are new to TimescaleDB, follow these installation instructions. (Note: If you are just getting started, we encourage you to check out our features matrix to see which version of TimescaleDB is best for you.)
FAQs
How much does Outflux cost?
Outflux is free to use! TimescaleDB is also free to use, but if you are looking to upgrade to TimescaleDB Enterprise, we offer different pricing options based on your needs.
Can you tell me more about the differences between TimescaleDB vs InfluxDB?
Of course! We’ve benchmarked TimescaleDB vs InfluxDB and you can read the results in our whitepaper.
What is the difference between SQL and Flux?
We are glad you asked! Read this post to learn all about the two query languages.
Does Outflux do migrations in a live fashion?
No, Outflux currently only supports bulk migrations, so active inserts into InfluxDB after Outflux is used will not migrate over. Continuous migrations from InfluxDB will be supported with other upcoming solutions.