MySQL 9.3.0
Source Code Documentation
log_sanitizer.h
// Copyright (c) 2022, 2025, Oracle and/or its affiliates.
//
// This program is free software; you can redistribute it and/or modify
// it under the terms of the GNU General Public License, version 2.0,
// as published by the Free Software Foundation.
//
// This program is designed to work with certain software (including
// but not limited to OpenSSL) that is licensed under separate terms,
// as designated in a particular file or component or in included license
// documentation. The authors of MySQL hereby grant you an additional
// permission to link the program and your derivative works with the
// separately licensed software that they have either included with
// the program or referenced in the documentation.
//
// This program is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License, version 2.0, for more details.
//
// You should have received a copy of the GNU General Public License
// along with this program; if not, write to the Free Software
// Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.

#ifndef BINLOG_LOG_SANITIZER_H
#define BINLOG_LOG_SANITIZER_H

#include <functional>
#include "sql/binlog.h"
#include "sql/binlog/decompressing_event_object_istream.h"  // binlog::Decompressing_event_object_istream
#include "sql/binlog_ostream.h"  // binlog::tools::Iterator
#include "sql/binlog_reader.h"   // Binlog_file_reader
#include "sql/log_event.h"       // Log_event
#include "sql/xa.h"              // XID

namespace binlog {

/// @brief Class used to recover binary / relay log file
/// @details This base class is responsible for finding the last valid
/// position of a relay log / binary log file, meaning, the position of the
/// last finished event which occurs outside of a transaction boundary.
/// Validation starts when the first reliable position has been found, i.e.:
/// - source rotation event
/// - source FDE
/// - source STOP event
/// - first finished transaction:
///   * Query log event with: COMMIT / ROLLBACK / XA COMMIT / XA ROLLBACK /
///     atomic DDL
///   * XID Log event
/// Validation ends at the end of the binlog file / relay log file or in case
/// further reading is not possible.
/// Binary log recovery:
///   A binary log file always starts with an FDE, which is the first valid
///   position within the file. Binary log files are never removed by the log
///   sanitizer.
/// Relay log recovery:
///   If no valid position has been found in any of the relay log files,
///   the log sanitizer will keep all of the relay log files.
///   In case a valid position has been found in any of the first relay log
///   files, relay log files that do not contain a valid position outside of
///   a transaction boundary will be removed.
class Log_sanitizer {
 public:
  /// @brief Ctor
  Log_sanitizer();

  /// @brief Dtor
  virtual ~Log_sanitizer() = default;

  /// @brief Retrieves the position of the last binlog/relay log event that
  /// ended a transaction or position after the RLE/FDE/SE that comes from
  /// the source
  /// @return The position of the last binlog event that ended a transaction
  my_off_t get_valid_pos() const;

  /// @brief Retrieves the last valid source position of an event
  /// read from the binary log / relay log file, which may be:
  /// - source position of the event ending a transaction
  /// - source position written in the source RLE
  /// @return The position of the last binlog event that ended a transaction
  /// and indicator whether this position is valid
  std::pair<my_off_t, bool> get_valid_source_pos() const;

  /// @brief Retrieves the updated name of the binlog source file
  /// @return Updated source file or empty string; indicator equal to true in
  /// case filename is valid
  std::pair<std::string, bool> get_valid_source_file() const;

  /// @brief Retrieves whether or not the log was correctly processed in full.
  /// @return true if the log processing ended with errors, false otherwise.
  bool is_log_malformed() const;

  /// @brief Retrieves the textual representation of the encountered failure,
  /// if any.
  /// @return the string containing the textual representation of the failure,
  /// an empty string otherwise.
  std::string const &get_failure_message() const;

  std::string get_valid_file() const { return m_valid_file; }

  /// @brief Checks whether a valid sanitized log file needs truncation of
  /// the last, partially written transaction or events that cannot be
  /// safely read
  /// @return true in case log file needs to be truncated, false
  /// otherwise
  bool is_log_truncation_needed() const;

  /// @brief Checks whether a fatal error occurred during log sanitization
  /// (OOM / decompression error which we cannot handle)
  /// @return true in case fatal error occurred, false otherwise
  bool is_fatal_error() const;

 protected:
  /// @brief Function used to obtain memory key for derived classes
  /// @returns Reference to a memory key
  virtual PSI_memory_key &get_memory_key() const = 0;

  /// @brief This function goes through the opened file and searches for
  /// a valid position in a binary log file. It also gathers
  /// information about XA transactions which will be used during the
  /// binary log recovery
  /// @param reader Log reader, must be opened
  template <class Type_reader>
  void process_logs(Type_reader &reader);

  /// @brief This function iterates over the relay log files
  /// in the 'list_of_files' container, starting from the most recent one.
  /// It gathers information about XA transactions and performs
  /// a small validation of the log files. Validation starts
  /// once a first reliable position has been found (FDE/RLE/SE from the
  /// source or the end of a transaction), and proceeds till the end of file
  /// or until a read error has occurred.
  /// In case a valid position has been found within a file,
  /// relay log files that were created after this file will be removed.
  /// In case no valid position has been found within a file, sanitizer will
  /// iterate over events in the previous (older) relay log file.
  /// In case no valid position has been found in any of the files listed in
  /// the 'list_of_files' container, relay log files won't be removed. This
  /// may happen, e.g., in case we cannot decrypt events.
  /// @param reader Relay log file reader object
  /// @param list_of_files The list of relay logs we know, obtained
  /// from the relay log index
  /// @param log MYSQL_BIN_LOG object used to manipulate relay log files
  template <class Type_reader>
  void process_logs(Type_reader &reader,
                    const std::list<std::string> &list_of_files,
                    MYSQL_BIN_LOG &log);

  /// @brief This function will obtain the list of relay log files using the
  /// object of MYSQL_BIN_LOG class and iterate over them to find the last
  /// valid position within a relay log file. It will remove relay log files
  /// that contain only parts of the last, partially written transaction
  /// @param reader Relay log file reader object
  /// @param log MYSQL_BIN_LOG object used to manipulate relay log files
  template <class Type_reader>
  void process_logs(Type_reader &reader, MYSQL_BIN_LOG &log);

  /// @brief Reads and validates one log file
  /// @param[in] reader Reference to reader able to read processed log
  /// file
  /// @param[in] filename Name of the log file to process
  /// @returns true if processed log contains a valid log position outside
  /// of transaction boundaries
  template <class Type_reader>
  bool process_one_log(Type_reader &reader, const std::string &filename);

  /// @brief Indicates whether validation has started.
  /// In case of relay log sanitization, we start validation
  /// when we are sure that we are at a transaction boundary and we are able
  /// to recover the source position, meaning, when we detect:
  /// - first encountered Rotation Event, that comes from the source
  /// - end of a transaction (Xid event, QLE containing
  ///   COMMIT/ROLLBACK/XA COMMIT/XA ROLLBACK)
  /// - an atomic DDL transaction
  /// Since binary logs always start at a transaction boundary, when doing
  /// a binary log recovery, we start validation right away.
  /// By default, we are assuming that we are in the binary log recovery
  /// procedure
  bool m_validation_started;

  /// Position of the last binlog/relay log event that ended a transaction
  my_off_t m_valid_pos;
  /// Position of the last binlog event that ended a transaction (source
  /// position which corresponds to m_valid_pos)
  my_off_t m_valid_source_pos;
  /// Currently processed binlog file set in case source rotation
  /// event is encountered
  std::string m_valid_source_file{""};
  /// Last log file containing finished transaction
  std::string m_valid_file{""};
  /// Whether or not the event being processed is within a transaction
  bool m_in_transaction{false};
  /// Whether or not the binary log is malformed/corrupted or error occurred
  bool m_is_malformed{false};
  /// Whether or not the binary log has a fatal error
  bool m_fatal_error{false};
  /// Textual representation of the encountered failure
  std::string m_failure_message{""};
  /// Memory pool to use for the XID lists
  MEM_ROOT m_mem_root;
  /// Memory pool allocator to use with the normal transaction list
  Mem_root_allocator<my_xid> m_set_alloc;
  /// Memory pool allocator to use with the XA transaction list
  Mem_root_allocator<std::pair<const XID, XID_STATE::xa_states>> m_map_alloc;
  /// List of normal transactions fully written to the binary log
  Xid_commit_list m_internal_xids;
  /// List of XA transactions and states that appear in the binary log
  Xa_state_list::list m_external_xids;

  /// During binary log recovery, we check XIDs, however, during relay log
  /// sanitization we need to skip adding of external XIDs. Relay log recovery
  /// iterates over relay log backwards, therefore, when an XA transaction
  /// spans over separate relay log files, we may first encounter "XA COMMIT"
  /// and later on "XA PREPARE".
  bool m_skip_prepared_xids;

  /// Information on whether log needs to be truncated, i.e.
  /// log is not ending at transaction boundary or we cannot read it till the
  /// end
  bool m_is_log_truncation_needed;

  /// Indicator on whether a valid position has been found in the log file
  bool m_has_valid_pos{false};

  /// Indicator on whether a valid source position has been found in the log
  /// file
  bool m_has_valid_source_pos;

  /// Last opened file size
  my_off_t m_last_file_size;

  /// @brief Invoked when a `Query_log_event` is read from the binary log file
  /// reader.
  /// @details The underlying query string is inspected to determine if the
  /// SQL command starts or ends a transaction. The following commands are
  /// searched for:
  /// - BEGIN
  /// - COMMIT
  /// - ROLLBACK
  /// - DDL
  /// - XA START
  /// - XA COMMIT
  /// - XA ROLLBACK
  /// Check below for the description of the action that is taken for each.
  /// @param ev The `Query_log_event` to process
  void process_query_event(Query_log_event const &ev);

  /// @brief Invoked when an `Xid_log_event` is read from the binary log file
  /// reader.
  /// @details Actions taken to process the event:
  /// - If `m_in_transaction` flag is set to false, `m_is_malformed` is set
  ///   to true, indicating that the binary log is malformed.
  /// - The `m_in_transaction` flag is set to false, indicating that the
  ///   event ends a transaction.
  /// - The XID of the transaction is extracted and added to the list of
  ///   internally coordinated transactions `m_internal_xids`.
  /// - If the XID already exists in the list, `m_is_malformed` is set to
  ///   true, indicating that the binary log is malformed.
  /// @param ev The `Xid_log_event` to process
  void process_xid_event(Xid_log_event const &ev);

  /// @brief Invoked when an `XA_prepare_log_event` is read from the binary log
  /// file reader.
  /// @details Actions taken to process the event:
  /// - If `m_in_transaction` flag is set to false, `m_is_malformed` is set
  ///   to true, indicating that the binary log is malformed.
  /// - The `m_in_transaction` flag is set to false, indicating that the
  ///   event ends a transaction.
  /// - The XID of the transaction is extracted and added to the list of
  ///   externally coordinated transactions `m_external_xids`, alongside the
  ///   state COMMITTED if the event represents an `XA COMMIT ONE_PHASE` or
  ///   PREPARED if not.
  /// - If the XID already exists in the list associated with a state other
  ///   than `COMMITTED` or `ROLLEDBACK`, `m_is_malformed` is set to true,
  ///   indicating that the binary log is malformed.
  /// @param ev The `XA_prepare_log_event` to process
  void process_xa_prepare_event(XA_prepare_log_event const &ev);

  /// @brief Invoked when a `BEGIN` or an `XA START` is found in a
  /// `Query_log_event`.
  /// @details Actions taken to process the statement:
  /// - If `m_in_transaction` flag is set to true, `m_is_malformed` is set
  ///   to true, indicating that the binary log is malformed.
  /// - The `m_in_transaction` flag is set to true, indicating that the
  ///   event starts a transaction.
  void process_start();

  /// @brief Invoked when a `COMMIT` is found in a `Query_log_event`.
  /// @details Actions taken to process the statement:
  /// - If `m_in_transaction` flag is set to false, `m_is_malformed` is set
  ///   to true, indicating that the binary log is malformed.
  /// - The `m_in_transaction` flag is set to false, indicating that the
  ///   event ends a transaction.
  void process_commit();

  /// @brief Invoked when a `ROLLBACK` is found in a `Query_log_event`.
  /// @details Actions taken to process the statement:
  /// - If `m_in_transaction` flag is set to false, `m_is_malformed` is set
  ///   to true, indicating that the binary log is malformed.
  /// - The `m_in_transaction` flag is set to false, indicating that the
  ///   event ends a transaction.
  void process_rollback();

  /// @brief Invoked when a DDL is found in a `Query_log_event`.
  /// @details Actions taken to process the statement:
  /// - If `m_in_transaction` flag is set to true, `m_is_malformed` is set
  ///   to true, indicating that the binary log is malformed.
  /// - The XID of the transaction is extracted and added to the list of
  ///   internally coordinated transactions `m_internal_xids`.
  /// - If the XID already exists in the list, `m_is_malformed` is set to
  ///   true, indicating that the binary log is malformed.
  /// @param ev The `Query_log_event` to process
  void process_atomic_ddl(Query_log_event const &ev);

  /// @brief Invoked when an `XA COMMIT` is found in a `Query_log_event`.
  /// @details Actions taken to process the statement:
  /// - If `m_in_transaction` flag is set to true, `m_is_malformed` is set
  ///   to true, indicating that the binary log is malformed.
  /// - The `m_in_transaction` flag is set to false, indicating that the
  ///   event ends a transaction.
  /// - The XID of the transaction is extracted and added to the list of
  ///   externally coordinated transactions `m_external_xids`, alongside the
  ///   state COMMITTED.
  /// - If the XID already exists in the list associated with a state other
  ///   than `PREPARED`, `m_is_malformed` is set to true, indicating that the
  ///   binary log is malformed.
  /// @param query The query string to process
  void process_xa_commit(std::string const &query);

  /// @brief Invoked when an `XA ROLLBACK` is found in a `Query_log_event`.
  /// @details Actions taken to process the statement:
  /// - If `m_in_transaction` flag is set to true, `m_is_malformed` is set
  ///   to true, indicating that the binary log is malformed.
  /// - The `m_in_transaction` flag is set to false, indicating that the
  ///   event ends a transaction.
  /// - The XID of the transaction is extracted and added to the list of
  ///   externally coordinated transactions `m_external_xids`, alongside the
  ///   state ROLLEDBACK.
  /// - If the XID already exists in the list associated with a state other
  ///   than `PREPARED`, `m_is_malformed` is set to true, indicating that the
  ///   binary log is malformed.
  /// @param query The query string to process
  void process_xa_rollback(std::string const &query);

  /// @brief Parses the provided string for an XID and adds it to the externally
  /// coordinated transactions map, alongside the provided state.
  /// @param query The query to search and retrieve the XID from
  /// @param state The state to add to the map, alongside the XID
  void add_external_xid(std::string const &query,
                        enum_ha_recover_xa_state state);
};

}  // namespace binlog

#endif  // BINLOG_LOG_SANITIZER_H
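
The transaction-boundary bookkeeping described in the comments of this header can be illustrated with a small standalone sketch. It is not taken from the MySQL sources: `Kind`, `Event`, `Scan_result` and `scan` are hypothetical names, each event is reduced to a kind plus the offset just past it, and both stopping at the first malformed event and deriving the truncation flag from the file size are assumptions made only for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified event kinds. The real sanitizer inspects
// Log_event objects and query strings instead.
enum class Kind { Begin, Commit, Rollback, Xid, Other };

struct Event {
  Kind kind;
  std::uint64_t end_pos;  // offset just past this event in the file
};

struct Scan_result {
  std::uint64_t valid_pos{0};     // end of last event outside a transaction
  bool has_valid_pos{false};      // whether valid_pos was ever set
  bool in_transaction{false};     // state after the last event read
  bool malformed{false};          // start/end events did not pair up
  bool truncation_needed{false};  // file continues past valid_pos
};

// Walks the events in file order and remembers the last position that
// lies outside a transaction (assumed model, not the MySQL implementation).
Scan_result scan(const std::vector<Event> &events, std::uint64_t file_size) {
  Scan_result r;
  for (const Event &ev : events) {
    assert(ev.end_pos <= file_size);  // events must lie within the file
    switch (ev.kind) {
      case Kind::Begin:
        if (r.in_transaction) r.malformed = true;  // start inside a start
        r.in_transaction = true;
        break;
      case Kind::Commit:
      case Kind::Rollback:
      case Kind::Xid:
        if (!r.in_transaction) r.malformed = true;  // end without a start
        r.in_transaction = false;
        break;
      case Kind::Other:
        break;  // does not change the transaction state
    }
    if (r.malformed) break;  // assumption: stop at the first inconsistency
    if (!r.in_transaction) {
      r.valid_pos = ev.end_pos;
      r.has_valid_pos = true;
    }
  }
  r.truncation_needed = r.has_valid_pos && r.valid_pos < file_size;
  return r;
}
```

For example, scanning an event that ends at offset 120 outside any transaction, a complete transaction that ends at 331, and an unfinished transaction that runs to the end of a 500-byte file leaves `valid_pos` at 331 with `truncation_needed` set, which mirrors the "last finished event outside of a transaction boundary" rule in the class description.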