Releases: manticoresoftware/manticoresearch
Manticore Search 6.3.0
Released: May 23rd 2024
Major changes
- Issue #839 Implemented float_vector data type; implemented vector search.
- Issue #1673 INNER/LEFT JOIN (beta stage).
- Issue #1744 Implemented autodetection of date formats for timestamp fields.
- Issue #1720 Changed Manticore Search license from GPLv2-or-later to GPLv3-or-later.
- Commit 7a55 Running Manticore in Windows now requires Docker to run Buddy.
- Issue #1541 Added a REGEX full-text operator.
- Issue #2091 Ubuntu Noble 24.04 support.
- Commit 514d Revamp of time operations for better performance and new date/time functions:
  - CURDATE() - Returns the current date in the local timezone
  - QUARTER() - Returns the integer quarter of the year from a timestamp argument
  - DAYNAME() - Returns the weekday name for a given timestamp argument
  - MONTHNAME() - Returns the name of the month for a given timestamp argument
  - DAYOFWEEK() - Returns the integer weekday index for a given timestamp argument
  - DAYOFYEAR() - Returns the integer day of the year for a given timestamp argument
  - YEARWEEK() - Returns the integer year and the day code of the first day of current week for a given timestamp argument
  - DATEDIFF() - Returns the number of days between two given timestamps
  - DATE() - Formats the date part from a timestamp argument
  - TIME() - Formats the time part from a timestamp argument
  - timezone - Timezone used by date/time-related functions.
- Commit 30e7 Added range, histogram, date_range, and date_histogram aggregates to the HTTP interface and similar expressions into SQL.
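To make the descriptions of the new date/time functions concrete, here is a minimal Python sketch that mimics a few of them. It is an illustration only, not Manticore's implementation: it assumes UTC instead of the configured `timezone`, and the helper names (`quarter`, `dayname`, `monthname`, `dayofyear`, `datediff`) are hypothetical stand-ins for the SQL functions. The argument order of `datediff` (first minus second) is also an assumption.

```python
from datetime import datetime, timezone

DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")
MONTH_NAMES = ("January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November", "December")


def _utc(ts: int) -> datetime:
    """Convert a Unix timestamp to a timezone-aware UTC datetime (assumption: UTC)."""
    return datetime.fromtimestamp(ts, tz=timezone.utc)


def quarter(ts: int) -> int:
    """Integer quarter of the year (1..4)."""
    return (_utc(ts).month - 1) // 3 + 1


def dayname(ts: int) -> str:
    """English weekday name."""
    return DAY_NAMES[_utc(ts).weekday()]


def monthname(ts: int) -> str:
    """English month name."""
    return MONTH_NAMES[_utc(ts).month - 1]


def dayofyear(ts: int) -> int:
    """Integer day of the year (1..366)."""
    return _utc(ts).timetuple().tm_yday


def datediff(ts1: int, ts2: int) -> int:
    """Whole days between the date parts of two timestamps (ts1 minus ts2)."""
    return (_utc(ts1).date() - _utc(ts2).date()).days


if __name__ == "__main__":
    release = int(datetime(2024, 5, 23, 12, 0, tzinfo=timezone.utc).timestamp())
    print(quarter(release), dayname(release), monthname(release), dayofyear(release))
```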
Minor changes
- Issue #1285 Support of Filebeat versions 8.10 - 8.11.
- Issue #1771 ALTER TABLE ... type='distributed'.
- Issue #1788 Added the ability to copy tables using the CREATE TABLE ... LIKE ... WITH DATA SQL statement.
- Issue #2072 Optimized the table compacting algorithm: Previously, both manual OPTIMIZE and automatic auto_optimize processes would first merge chunks to ensure the count did not exceed the limit, and then expunge deleted documents from all other chunks containing deleted documents. This approach was sometimes too resource-intensive and has been disabled. Now, chunk merging occurs solely according to the progressive_merge setting. However, chunks with a high number of deleted documents are more likely to be merged.
- Commit ce6c Added protection against loading a secondary index of a newer version.
- Issue #1417 Partial replace via REPLACE INTO ... SET.
- Commit 7c16 Updated default merge buffer sizes:
  - `.spa` (scalar attrs): 256KB -> 8MB;
  - `.spb` (blob attrs): 256KB -> 8MB;
  - `.spc` (columnar attrs): 1MB, no change;
  - `.spds` (docstore): 256KB -> 8MB;
  - `.spidx` (secondary indexes): 256KB buffer -> 128MB memory limit;
  - `.spi` (dictionary): 256KB -> 16MB;
  - `.spd` (doclists): 8MB, no change;
  - `.spp` (hitlists): 8MB, no change;
  - `.spe` (skiplists): 256KB -> 8MB.
- Issue #1859 Added composite aggregation via JSON.
- Commit 216b Disabled PCRE.JIT due to issues with some regex patterns and no significant time benefit.
- Commit 55cd Added support for vanilla Galera v.3 (api v25) (`libgalera_smm.so` from MySQL 5.x).
- Commit 86f9 Changed metric suffix from `_rate` to `_rps`.
- Commit c0c1 Improved docs about balancer HA support.
- Commit d1d2 Changed `index` to `table` in error messages; fixed bison parser error message fixup.
- Commit fd26 Support `manticore.tbl` as table name.
- Issue #1105 Support for running indexer via systemd (docs). ❤️ Thank you, @subnix for the PR.
- Issue #1294 Secondary indexes support in GEODIST().
- Issue #1394 Simplified SHOW THREADS.
- Issue #1424 Added support for the default values (`agent_connect_timeout` and `agent_query_timeout`) for the `create distributed table` statement.
- Issue #1442 Added expansion_limit search query option that overrides `searchd.expansion_limit`.
- Issue #1448 Implemented ALTER TABLE for int->bigint conversion.
- Issue #146 Meta information in MySQL response.
- Issue #1494 SHOW VERSION.
- Issue #1582 Support of deleting documents by id array via JSON.
- Issue #1589 Improve error "unsupported value type".
- Issue #1634 Added Buddy ve...
Manticore Search 6.2.12
Released: Aug 23rd 2023
Version 6.2.12 continues the 6.2 series and addresses issues discovered after the release of 6.2.0.
Bugfixes
- ❗Issue #1351 "Manticore 6.2.0 doesn't start via systemctl on Centos 7": Modified `TimeoutStartSec` from `infinity` to `0` for better compatibility with Centos 7.
- ❗Issue #1364 "Crash after upgrading from 6.0.4 to 6.2.0": Added replay functionality for empty binlog files from older binlog versions.
- PR #1334 "fix typo in searchdreplication.cpp": Corrected a typo in `searchdreplication.cpp`: beggining -> beginning.
- Issue #1337 "Manticore 6.2.0 WARNING: conn (local)(12), sock=8088: bailing on failed MySQL header, AsyncNetInputBuffer_c::AppendData: error 11 (Resource temporarily unavailable) return -1": Lowered the verbosity level of the MySQL interface warning about the header to logdebugv.
- Issue #1355 "join cluster hangs when node_address can't be resolved": Improved replication retry when certain nodes are unreachable, and their name resolution fails. This should resolve issues in Kubernetes and Docker nodes related to replication. Enhanced the error message for replication start failures and made updates to test model 376. Additionally, provided a clear error message for name resolution failures.
- Issue #1361 "No lower case mapping for "Ø" in charset non_cjk": Adjusted the mapping for the 'Ø' character.
- Issue #1365 "searchd leaves binlog.meta and binlog.001 after clean stop": Ensured that the last empty binlog file is removed properly.
- Commit 0871: Fixed the `Thd_t` build issue on Windows related to atomic copy restrictions.
- Commit 1cc0: Addressed an issue with FT CBO vs `ColumnarScan`.
- Commit c6bf: Made corrections to test 376 and added a substitution for the `AF_INET` error in the test.
- Commit cbc3: Resolved a deadlock issue during replication when updating blob attributes versus replacing documents. Also removed the rlock of the index during commit because it's already locked at a more basic level.
Minor changes
- Commit 4f91 Updated info on `/bulk` endpoints in the manual.
MCL
- Support of Manticore Columnar Library v2.2.4
Manticore Search 6.2.0
Released: Aug 4th 2023
Major changes
- The query optimizer has been enhanced to support full-text queries, significantly improving search efficiency and performance.
- Integrations with:
  - mysqldump - to make logical backups using `mysqldump`
  - Apache Superset and Grafana to visualize data stored in Manticore
  - HeidiSQL and DBForge for easier development with Manticore
- We've started using GitHub workflows, making it simpler for contributors to utilize the same Continuous Integration (CI) process that the core team applies when preparing packages. All jobs can be run on GitHub-hosted runners, which facilitates seamless testing of changes in your fork of Manticore Search.
- We've started using CLT to test complex scenarios. For example, we're now able to ensure that a package built after a commit can be properly installed across all supported Linux operating systems. The Command Line Tester (CLT) provides a user-friendly way to record tests in an interactive mode and to effortlessly replay them.
- Significant performance improvement in count distinct operation by employing a combination of hash tables and HyperLogLog.
- Enabled multithreaded execution of queries containing secondary indexes, with the number of threads limited to the count of physical CPU cores. This should considerably improve the query execution speed.
- `pseudo_sharding` has been adjusted to be limited to the number of free threads. This update considerably enhances the throughput performance.
- Users now have the option to specify the default attribute storage engine via the configuration settings, providing better customization to match specific workload requirements.
- Support for Manticore Columnar Library 2.2.0 with numerous bug fixes and improvements in Secondary indexes.
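The count distinct item above mentions combining hash tables with HyperLogLog. One plausible way to combine them, sketched below in Python, is to count exactly in a hash set while the cardinality is small and to switch to a HyperLogLog estimate once a threshold is exceeded. This is an assumption about the general technique, not Manticore's actual code; the class name, the `exact_limit` threshold and the register precision are all hypothetical.

```python
import hashlib
import math


class HybridDistinctCounter:
    """Exact hash-set counting for small inputs, HyperLogLog beyond a threshold."""

    def __init__(self, exact_limit: int = 1000, precision: int = 12):
        self.exact_limit = exact_limit   # hypothetical switch-over point
        self.p = precision               # number of bits used to pick a register
        self.m = 1 << precision          # number of HyperLogLog registers
        self.exact = set()               # used while the cardinality is small
        self.registers = None            # allocated when switching to HyperLogLog

    @staticmethod
    def _hash64(value) -> int:
        digest = hashlib.sha1(str(value).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def _hll_add(self, value) -> None:
        h = self._hash64(value)
        idx = h >> (64 - self.p)                        # top p bits select a register
        rest = h & ((1 << (64 - self.p)) - 1)           # remaining 64 - p bits
        rank = (64 - self.p) - rest.bit_length() + 1    # position of the first 1 bit
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def add(self, value) -> None:
        if self.registers is not None:
            self._hll_add(value)
            return
        self.exact.add(value)
        if len(self.exact) > self.exact_limit:
            # Too many distinct values for exact mode: replay them into HyperLogLog.
            self.registers = [0] * self.m
            for seen in self.exact:
                self._hll_add(seen)
            self.exact = None

    def count(self) -> int:
        if self.registers is None:
            return len(self.exact)
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            return round(self.m * math.log(self.m / zeros))
        return round(raw)


if __name__ == "__main__":
    counter = HybridDistinctCounter()
    for i in range(200_000):
        counter.add(i % 50_000)
    print(counter.count())  # an estimate close to 50000
```

With 4096 registers the expected relative error of the estimate is roughly 1.6%, which is the usual trade-off: bounded memory in exchange for an approximate answer on high-cardinality data.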
Minor changes
- Buddy #153: The /pq HTTP endpoint now serves as an alias for the `/json/pq` HTTP endpoint.
- Commit 0bf1: We've ensured multi-byte compatibility for `upper()` and `lower()`.
- Commit 2bb9: Instead of scanning the index for `count(*)` queries, a precalculated value is now returned.
- Commit 3c84: It's now possible to use `SELECT` for making arbitrary calculations and displaying `@@sysvars`. Unlike before, you are no longer limited to just one calculation. Therefore, queries like `select user(), database(), @@version_comment, version(), 1+1 as a limit 10` will return all the columns. Note that the optional 'limit' will always be ignored.
- Commit 6aca: Implemented the `CREATE DATABASE` stub query.
- Commit 9dc1: When executing `ALTER TABLE table REBUILD SECONDARY`, secondary indexes are now always rebuilt, even if attributes weren't updated.
- Commit 46ed: Sorters utilizing precalculated data are now identified before using CBO to avoid unnecessary CBO calculations.
- Commit 102a: Implemented mocking and utilization of the full-text expression stack to prevent daemon crashes.
- Commit 979f: A speedy code path has been added for match cloning code for matches that don't use string/mvas/json attributes.
- Commit a073: Added support for the `SELECT DATABASE()` command. However, it will always return `Manticore`. This addition is crucial for integrations with various MySQL tools.
- Commit bc04: Modified the response format of the /cli endpoint, and added the `/cli_json` endpoint to function as the previous `/cli`.
- Commit d70b: The `thread_stack` can now be altered during runtime using the `SET` statement. Both session-local and daemon-wide variants are available. Current values can be accessed in the `show variables` output.
- Commit d96e: Code has been integrated into CBO to more accurately estimate the complexity of executing filters over string attributes.
- Commit e77d: The DocidIndex cost calculation has been improved, enhancing overall performance.
- Commit f3ae: Load metrics, similar to 'uptime' on Linux, are now visible in the `SHOW STATUS` command.
- Commit f3cc: The field and attribute order for `DESC` and `SHOW CREATE TABLE` now match that of `SELECT * FROM`.
- Commit f3d2: Different internal parsers now provide their internal mnemonic code (e.g., `P01`) during various errors. This enhancement aids in identifying which parser caused an error and also obscures non-essential internal details.
- Issue #271 "Sometimes CALL SUGGEST does not suggest a correction of a single letter typo": Improved SUGGEST/QSUGGEST behaviour for short words: added the option `sentence` to show the entire sentence.
- Issue #696 "Percolate index does not search properly by exact phrase query when stemming enabled": The percolate query has been modified to handle an exact term modifier, improving search functionality.
- Issue #829 "DATE FORMATTING methods": added the date_format() select list expression, which exposes the `strftime()` function.
- Issue #961 "Sorting buckets via HTTP JSON API": introduced an optional sort property for each bucket of aggregates in the HTTP interface.
- Issue #1062 "Improve error logging of JSON insert api failure - "unsupported value type"": The `/bulk` endpoint reports information regarding the number of processed and non-processed strings (documents) in case of an error.
- Issue #1070 "CBO hints don't support multiple attributes": Enabled index hints to handle multiple attributes.
- Issue #1106 "Add tags to http search query": Tags have been added to HTTP PQ responses.
- Issue #1301 "buddy should not create table in parallel": Resolved an issue that was causing failures from parallel CREATE TABLE operations. Now, only one `CREATE TABLE` operation can run at a time.
- Issue #1303 "add support of @ to column names".
- Issue #1316 "Queries on taxi dataset are slow with ps=1": The CBO logic has been refined, and the default histogram resolution has been set to 8k for better accuracy on attributes with randomly distributed values.
- Issue #1317 "Fix CBO vs fulltext on hn dataset": Enhanced logic has been implemented for determining when to use bitmap iterator intersection and when to use a priority queue.
- Issue #1318 "columnar: change iterator interface to single-call": Columnar iterators now use a single `Get` call, replacing the previous two-step `AdvanceTo` + `Get` calls to retrieve a value.
- Issue #1319 "Aggregate calc speedup (remove CheckReplaceEntry?)": The `CheckReplaceEntry` call was removed from...
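Issue #829 above notes that date_format() exposes `strftime()`. As a rough illustration of what strftime-style format strings produce, here is a small Python sketch. It uses Python's own `time.strftime` with UTC for reproducibility; the helper name `date_format_sketch` is hypothetical, and the exact set of specifiers Manticore accepts and the timezone it applies are not confirmed here.

```python
import time


def date_format_sketch(ts: int, fmt: str) -> str:
    """Format a Unix timestamp with a strftime-style pattern (UTC assumed)."""
    return time.strftime(fmt, time.gmtime(ts))


if __name__ == "__main__":
    # Unix epoch formatted as date and time.
    print(date_format_sketch(0, "%Y-%m-%d %H:%M:%S"))
```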
Manticore Search 6.0.4
Released: Mar 15 2023
New features
- Improved integration with Logstash, Beats etc. including:
  - Support for Logstash versions >= 7.13.
  - Auto-schema support.
  - Added handling of bulk requests in Elasticsearch-like format.
- Buddy commit ce90 Log Buddy version on Manticore start.
Bugfixes
- Issue #588, Issue #942 Fixed bad character at the search meta and call keywords for bigram index.
- Issue #1027 Lowercase HTTP headers are rejected.
- ❗Issue #1039 Fixed memory leak at daemon on reading output of the Buddy console.
- Issue #1056 Fixed unexpected behavior of question mark.
- Issue #1064 - Fixed race condition in tokenizer lowercase tables causing a crash.
- Commit 59bb Fixed bulk writes processing in the JSON interface for documents with id explicitly set to null.
- Commit 7b6b Fixed term statistics in CALL KEYWORDS for multiple same terms.
- Commit f381 Default config is now created by Windows installer; paths are no longer substituted in runtime.
- Commit 6940, Commit cc5a Fixed replication issues for cluster with nodes in multiple networks.
- Commit 4972 Fixed `/pq` HTTP endpoint to be an alias of the `/json/pq` HTTP endpoint.
- Commit 3b53 Fixed daemon crash on Buddy restart.
- Buddy commit fba9 Display original error on invalid request received.
- Buddy commit db95 Allow spaces in backup path and add some magic to regexp to support single quotes also.
Manticore Search 6.0.2
Released: Feb 10 2023
Bugfixes
- Issue #1024 crash 2 Crash / Segmentation Fault on Facet search with larger number of results
- ❗Issue #1029 - WARNING: Compiled-in value KNOWN_CREATE_SIZE (16) is less than measured (208). Consider to fix the value!
- ❗Issue #1032 - Manticore 6.0.0 plain index crashes
- ❗Issue #1033 - multiple distributed lost on daemon restart
Manticore Search 6.0.0
Released: Feb 7 2023
Starting with this release, Manticore Search comes with Manticore Buddy, a sidecar daemon written in PHP that handles high-level functionality that does not require super low latency or high throughput. Manticore Buddy operates behind the scenes, and you may not even realize it is running. Although it is invisible to the end user, it was a significant challenge to make Manticore Buddy easily installable and compatible with the main C++-based daemon. This major change will allow the team to develop a wide range of new high-level features, such as shards orchestration, access control and authentication, and various integrations like mysqldump, DBeaver, Grafana mysql connector. For now it already handles SHOW QUERIES, BACKUP and Auto schema.
This release also includes more than 130 bug fixes and numerous features, many of which can be considered major.
Major Changes
- 🔬 Experimental: you can now execute Elasticsearch-compatible insert and replace JSON queries which enables using Manticore with tools like Logstash (version < 7.13), Filebeat and other tools from the Beats family. Enabled by default. You can disable it using `SET GLOBAL ES_COMPAT=off`.
- Support for Manticore Columnar Library 2.0.0 with numerous fixes and improvements in Secondary indexes. ⚠️ BREAKING CHANGE: Secondary indexes are ON by default as of this release. Make sure you do ALTER TABLE table_name REBUILD SECONDARY if you are upgrading from Manticore 5. See below for more details.
- Commit c436 Auto-schema: you can now skip creating a table, just insert the first document and Manticore will create the table automatically based on its fields. Read more about this in detail here. You can turn it on/off using searchd.auto_schema.
- Vast revamp of cost-based optimizer which lowers query response time in many cases.
- Issue #1008 Parallelization performance estimate in CBO.
- Issue #1014 CBO is now aware of secondary indexes and can act smarter.
- Commit cef9 Encoding stats of columnar tables/fields are now stored in the meta data to help CBO make smarter decisions.
- Commit 2b95 Added CBO hints for fine-tuning its behaviour.
- Telemetry: we are excited to announce the addition of telemetry in this release. This feature allows us to collect anonymous and depersonalized metrics that will help us improve the performance and user experience of our product. Rest assured, all data collected is completely anonymous and will not be linked to any personal information. This feature can be easily turned off in the settings if desired.
- Commit 5aaf ALTER TABLE table_name REBUILD SECONDARY to rebuild secondary indexes whenever you want, for example:
  - when you migrate from Manticore 5 to the newer version,
  - when you did UPDATE (i.e. in-place update, not replace) of an attribute in the index
- Issue #821 New tool `manticore-backup` for backing up and restoring Manticore instance
- SQL command BACKUP to do backups from inside Manticore.
- SQL command SHOW QUERIES as an easy way to see running queries rather than threads.
- Issue #551 SQL command `KILL` to kill a long-running `SELECT`.
- Dynamic `max_matches` for aggregation queries to increase accuracy and lower response time.
Minor changes
- Issue #822 SQL commands FREEZE/UNFREEZE to prepare a real-time/plain table for a backup.
- Commit c470 New settings `accurate_aggregation` and `max_matches_increase_threshold` for controlled aggregation accuracy.
- Issue #718 Support for signed negative 64-bit IDs. Note, you still can't use IDs > 2^63, but you can now use ids in the range from -2^63 to 0.
- As we recently added support for secondary indexes, things became confusing as "index" could refer to a secondary index, a full-text index, or a plain/real-time `index`. To reduce confusion, we are renaming the latter to "table". The following SQL/command line commands are affected by this change. Their old versions are deprecated, but still functional:
  - `index <table name>` => `table <table name>`,
  - `searchd -i / --index` => `searchd -t / --table`,
  - `SHOW INDEX STATUS` => `SHOW TABLE STATUS`,
  - `SHOW INDEX SETTINGS` => `SHOW TABLE SETTINGS`,
  - `FLUSH RTINDEX` => `FLUSH TABLE`,
  - `OPTIMIZE INDEX` => `OPTIMIZE TABLE`,
  - `ATTACH TABLE plain TO RTINDEX rt` => `ATTACH TABLE plain TO TABLE rt`,
  - `RELOAD INDEX` => `RELOAD TABLE`,
  - `RELOAD INDEXES` => `RELOAD TABLES`.
  We are not planning to make the old forms obsolete, but to ensure compatibility with the documentation, we recommend changing the names in your application. What will be changed in a future release is the "index" to "table" rename in the output of various SQL and JSON commands.
- Queries with stateful UDFs are now forced to be executed in a single thread.
- Issue #1011 Refactoring of everything related to time scheduling as a prerequisite for parallel chunks merging.
- ⚠️ BREAKING CHANGE: Columnar storage format has been changed. You need to rebuild those tables that have columnar attributes.
- ⚠️ BREAKING CHANGE: Secondary indexes file format has been changed, so if you are using secondary indexes for searching and have `searchd.secondary_indexes = 1` in your configuration file, be aware that the new Manticore version will skip loading the tables that have secondary indexes. It's recommended to:
  - Before you upgrade change `searchd.secondary_indexes` to 0 in the configuration file.
  - Run the instance. Manticore will load up the tables with a warning.
  - Run `ALTER TABLE <table name> REBUILD SECONDARY` for each index to rebuild secondary indexes.
  If you are running a replication cluster, you'll need to run `ALTER TABLE <table name> REBUILD SECONDARY` on all the nodes or follow this instruction with just one change: run the `ALTER .. REBUILD SECONDARY` instead of the `OPTIMIZE`.
- ⚠️ BREAKING CHANGE: The binlog version has been updated, so any binlogs from previous versions will not be replayed. It is important to ensure that Manticore Search is stopped cleanly during the upgrade process. This means that there should be no binlog files in `/var/lib/manticore/binlog/` except for `binlog.meta` after stopping the previous instance.
- Issue #849 `SHOW SETTINGS`: helper command for manticore-backup.
- Issue #1007 SET GLOBAL CPUSTATS=1/0 turns on/off cpu time tracking; SHOW THREADS now doesn'...
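For the secondary index upgrade steps above, the rebuild statement has to be run once per table. A minimal Python sketch that generates those statements for a list of tables follows. The table names are hypothetical, and the sketch only builds the SQL text taken from the note; sending it to the server (for example through a MySQL client) is left out.

```python
def rebuild_secondary_statements(tables):
    """Return one 'ALTER TABLE ... REBUILD SECONDARY' statement per table name."""
    return [f"ALTER TABLE {name} REBUILD SECONDARY;" for name in tables]


if __name__ == "__main__":
    # 'products' and 'logs' are made-up table names for illustration.
    for statement in rebuild_secondary_statements(["products", "logs"]):
        print(statement)
```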
Manticore Search 5.0.2
Released: May 30th 2022
- ❗Issue #791 - wrong stack size could cause a crash.
Manticore Search 5.0.0
Released: May 18th 2022
Release blogpost https://manticoresearch.com/blog/manticore-search-5-0-0/
Major new features
- 🔬 Support for Manticore Columnar Library 1.15.2, which enables Secondary indexes beta version. Building secondary indexes is on by default for plain and real-time columnar and row-wise indexes (if Manticore Columnar Library is in use), but to enable it for searching you need to set `secondary_indexes = 1` either in your configuration file or using SET GLOBAL. The new functionality is supported in all operating systems except old Debian Stretch and Ubuntu Xenial.
- Read-only mode: you can now specify listeners that process only read queries discarding any writes.
- New /cli endpoint that makes running SQL queries over HTTP even easier.
- Faster bulk INSERT/REPLACE/DELETE via JSON over HTTP: previously you could provide multiple write commands via HTTP JSON protocol, but they were processed one by one, now they are handled as a single transaction.
- #720 Nested filters support in JSON protocol. Previously you couldn't code things like `a=1 and (b=2 or c=3)` in JSON: `must` (AND), `should` (OR) and `must_not` (NOT) worked only on the highest level. Now they can be nested.
- Support for Chunked transfer encoding in HTTP protocol. You can now use chunked transfer in your application to transfer large batches with lower resource consumption (since you don't need to calculate `Content-Length`). On the server's side Manticore now always processes incoming HTTP data in streaming fashion without waiting for the whole batch to be transferred as previously, which:
  - decreases peak RAM consumption, which lowers a chance of OOM
  - decreases response time (our tests showed 11% decrease for processing a 100MB batch)
  - lets you overcome max_packet_size and transfer batches much larger than the largest allowed value of `max_packet_size` (128MB), e.g. 1GB at once.
- #719 HTTP interface support of `100 Continue`: now you can transfer large batches from `curl` (including curl libraries used by various programming languages) which by default does `Expect: 100-continue` and waits some time before actually sending the batch. Previously you had to add `Expect:` header, now it's not needed.
- Pseudo sharding is enabled by default.
- Having at least one full-text field in a real-time/plain index is not mandatory anymore. You can now use Manticore even in cases not having anything to do with full-text search.
- Fast fetching for attributes backed by Manticore Columnar Library: queries like `select * from <columnar table>` are now much faster than previously, especially if there are many fields in the schema.
- ⚠️ BREAKING CHANGE: Implicit cutoff. Manticore now doesn't spend time and resources processing data you don't need in the result set which will be returned. The downside is that it affects `total_found` in SHOW META and hits.total in JSON output. It is now only accurate in case you see `total_relation: eq` while `total_relation: gte` means the actual number of matching documents is greater than the `total_found` value you've got. To retain the previous behaviour you can use search option `cutoff=0`, which makes `total_relation` always `eq`.
- ⚠️ BREAKING CHANGE: All full-text fields are now stored by default in plain indexes. You need to use `stored_fields =` (empty value) to make all fields non-stored (i.e. revert to the previous behaviour).
- #715 HTTP JSON supports search options.
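The nested filters item (#720) can be illustrated with the `a=1 and (b=2 or c=3)` condition it mentions. The Python sketch below builds such a nested `bool` clause and includes a tiny evaluator to show the intended AND/OR/NOT semantics. Treat the details as assumptions: the table name `test`, the use of `equals` leaf filters and the overall request shape follow the general style of the JSON query syntax but should be checked against the manual, and the evaluator is only an illustration, not how Manticore processes filters.

```python
import json


def build_nested_filter() -> dict:
    """a=1 AND (b=2 OR c=3) expressed with nested bool clauses."""
    return {
        "bool": {
            "must": [
                {"equals": {"a": 1}},
                {"bool": {"should": [
                    {"equals": {"b": 2}},
                    {"equals": {"c": 3}},
                ]}},
            ]
        }
    }


def matches(node: dict, doc: dict) -> bool:
    """Evaluate the filter tree against a document dict (illustration only)."""
    if "equals" in node:
        (field, value), = node["equals"].items()
        return doc.get(field) == value
    clause = node["bool"]
    # must = AND, should = OR, must_not = NOT; combining them in one clause
    # is treated here as a conjunction of the three groups (an assumption).
    ok = all(matches(child, doc) for child in clause.get("must", []))
    if clause.get("should"):
        ok = ok and any(matches(child, doc) for child in clause["should"])
    return ok and not any(matches(child, doc) for child in clause.get("must_not", []))


# Hypothetical request body for a search against a table named "test".
request_body = json.dumps({"index": "test", "query": build_nested_filter()})

if __name__ == "__main__":
    print(request_body)
    print(matches(build_nested_filter(), {"a": 1, "b": 0, "c": 3}))
```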
Minor changes
- ⚠️ BREAKING CHANGE: Index meta file format change. Previously meta files (`.meta`, `.sph`) were in binary format, now it's just json. The new Manticore version will convert older indexes automatically, but:
  - you can get warning like `WARNING: ... syntax error, unexpected TOK_IDENT`
  - you won't be able to run the index with previous Manticore versions, make sure you have a backup
- ⚠️ BREAKING CHANGE: Session state support with help of HTTP keep-alive. This makes HTTP stateful when the client supports it too. For example, using the new /cli endpoint and HTTP keep-alive (which is on by default in all browsers) you can call `SHOW META` after `SELECT` and it will work the same way it works via mysql. Note, previously `Connection: keep-alive` HTTP header was supported too, but it only caused reusing the same connection. Since this version it also makes the session stateful.
- You can now specify `columnar_attrs = *` to define all your attributes as columnar in the plain mode which is useful in case the list is long.
- Faster replication SST
- ⚠️ BREAKING CHANGE: Replication protocol has been changed. If you are running a replication cluster, then when upgrading to Manticore 5 you need to:
  - stop all your nodes first cleanly
  - and then start the node which was stopped last with `--new-cluster` (run tool `manticore_new_cluster` in Linux).
  - read about restarting a cluster for more details.
- Replication improvements:
  - Faster SST
  - Noise resistance which can help in case of unstable network between replication nodes
  - Improved logging
- Security improvement: Manticore now listens on `127.0.0.1` instead of `0.0.0.0` in case no `listen` at all is specified in config. Even though in the default configuration which is shipped with Manticore Search the `listen` setting is specified and it's not typical to have a configuration with no `listen` at all, it's still possible. Previously Manticore would listen on `0.0.0.0` which is not secure, now it listens on `127.0.0.1` which is usually not exposed to the Internet.
- Faster aggregation over columnar attributes.
- Increased `AVG()` accuracy: previously Manticore used `float` internally for aggregations, now it uses `double` which increases the accuracy significantly.
- Improved support for JDBC MySQL driver.
- `DEBUG malloc_stats` support for jemalloc.
- optimize_cutoff is now available as a per-table setting which can be set when you CREATE or ALTER a table.
- ⚠️ BREAKING CHANGE: query_log_format is now `sphinxql` by default. If you are used to `plain` format you need to add `query_log_format = plain` to your configuration file.
- Significant memory consumption improvements: Manticore consumes significantly less RAM now in case of long and intensive insert/replace/optimize workload in case stored fields are used.
- shutdown_timeout default value was increased from 3 seconds to 60 seconds.
- Commit ffd0499d Support for Java mysql connector >= 6.0.3: in Java mysql connection 6.0.3 they changed the way they connect to mysql which broke compatibility with Manticore. The new behaviour is now supported.
- Commit 1da6dbec disabled saving a new disk chunk on loading an index (e.g. on searchd startup).
- Issue #746 Support for glibc >= 2.34.
- Issue #784 count 'VIP' connections separately from usual (non-VIP). Previously VIP connections were counted towards the `max_connections` limit, which could cause "maxed out" error for non-VIP connections. Now VIP connections are not counted towards the limit. Current number of VIP connections can be also seen in `SHOW STATUS` and `status`.
- ID can now be specified explicitly.
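The `AVG()` accuracy item is easy to reproduce outside Manticore. The Python sketch below averages the same values twice: once with an accumulator rounded to 32-bit float after every addition, and once with ordinary double precision. It only demonstrates the general numeric effect of accumulator width; the function names are hypothetical and nothing here is taken from Manticore's code.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def avg_float32(values) -> float:
    """Average using an accumulator that is rounded to float32 after each step."""
    total = 0.0
    for v in values:
        total = to_float32(total + to_float32(v))
    return to_float32(total / len(values))


def avg_double(values) -> float:
    """Average using ordinary double-precision accumulation."""
    return sum(values) / len(values)


values = [0.1] * 100_000
err32 = abs(avg_float32(values) - 0.1)   # error with a float accumulator
err64 = abs(avg_double(values) - 0.1)    # error with a double accumulator

if __name__ == "__main__":
    print(f"float32 accumulator error: {err32:.3e}")
    print(f"double accumulator error:  {err64:.3e}")
```

The float accumulator drifts because, once the running sum is large, each added 0.1 is rounded to the coarse spacing of nearby 32-bit values; the double accumulator has far finer spacing at the same magnitude.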
⚠️ Other minor breaking changes
⚠️ BM25F formula has been slightly updated to improve search relevance. This only affects search results in case you use function [BM25F()](https://manual.manticoresearch.com/Functions/Sea...
Manticore Search 4.2.0
Released: Dec 23rd 2021
Major new features
- Pseudo-sharding support for real-time indexes and full-text queries. In the previous release we added limited pseudo sharding support. Starting from this version you can get all the benefits of pseudo sharding and your multi-core processor by just enabling searchd.pseudo_sharding. The coolest thing is that you don't need to do anything with your indexes or queries for that, just enable it and if you have free CPU it will be used to lower your response time. It supports plain and real-time indexes for full-text, filtering and analytical queries. For example, enabling pseudo sharding makes most queries' response time about 10x lower on average on the Hacker News curated comments dataset multiplied 100 times (116 million docs in a plain index).
- Debian Bullseye is now supported.
- PQ transactions are now atomic and isolated. Previously PQ transactions support was limited. It enables much faster REPLACE into PQ, especially when you need to replace a lot of rules at once. Performance details:
Previous version 4.0.2
It takes 48 seconds to insert 1M PQ rules and 406 seconds to REPLACE just 40K in 10K batches.
root@perf3 ~ # mysql -P9306 -h0 -e "drop table if exists pq; create table pq (f text, f2 text, j json, s string) type='percolate';"; date; for m in `seq 1 1000`; do (echo -n "insert into pq (id,query,filters,tags) values "; for n in `seq 1 1000`; do echo -n "(0,'@f (cat | ( angry dog ) | (cute mouse)) @f2 def', 'j.json.language=\"en\"', '{\"tag1\":\"tag1\",\"tag2\":\"tag2\"}')"; [ $n != 1000 ] && echo -n ","; done; echo ";")|mysql -P9306 -h0; done; date; mysql -P9306 -h0 -e "select count(*) from pq"
Wed Dec 22 10:24:30 AM CET 2021
Wed Dec 22 10:25:18 AM CET 2021
+----------+
| count(*) |
+----------+
| 1000000 |
+----------+
root@perf3 ~ # date; (echo "begin;"; for offset in `seq 0 10000 30000`; do n=0; echo "replace into pq (id,query,filters,tags) values "; for id in `mysql -P9306 -h0 -NB -e "select id from pq limit $offset, 10000 option max_matches=1000000"`; do echo "($id,'@f (tiger | ( angry bear ) | (cute panda)) @f2 def', 'j.json.language=\"de\"', '{\"tag1\":\"tag1\",\"tag2\":\"tag2\"}')"; n=$((n+1)); [ $n != 10000 ] && echo -n ","; done; echo ";"; done; echo "commit;") > /tmp/replace.sql; date
Wed Dec 22 10:26:23 AM CET 2021
Wed Dec 22 10:26:27 AM CET 2021
root@perf3 ~ # time mysql -P9306 -h0 < /tmp/replace.sql
real 6m46.195s
user 0m0.035s
sys 0m0.008s
New version 4.2.0
It takes 34 seconds to insert 1M PQ rules and 23 seconds to REPLACE them in 10K batches.
root@perf3 ~ # mysql -P9306 -h0 -e "drop table if exists pq; create table pq (f text, f2 text, j json, s string) type='percolate';"; date; for m in `seq 1 1000`; do (echo -n "insert into pq (id,query,filters,tags) values "; for n in `seq 1 1000`; do echo -n "(0,'@f (cat | ( angry dog ) | (cute mouse)) @f2 def', 'j.json.language=\"en\"', '{\"tag1\":\"tag1\",\"tag2\":\"tag2\"}')"; [ $n != 1000 ] && echo -n ","; done; echo ";")|mysql -P9306 -h0; done; date; mysql -P9306 -h0 -e "select count(*) from pq"
Wed Dec 22 10:06:38 AM CET 2021
Wed Dec 22 10:07:12 AM CET 2021
+----------+
| count(*) |
+----------+
| 1000000 |
+----------+
root@perf3 ~ # date; (echo "begin;"; for offset in `seq 0 10000 990000`; do n=0; echo "replace into pq (id,query,filters,tags) values "; for id in `mysql -P9306 -h0 -NB -e "select id from pq limit $offset, 10000 option max_matches=1000000"`; do echo "($id,'@f (tiger | ( angry bear ) | (cute panda)) @f2 def', 'j.json.language=\"de\"', '{\"tag1\":\"tag1\",\"tag2\":\"tag2\"}')"; n=$((n+1)); [ $n != 10000 ] && echo -n ","; done; echo ";"; done; echo "commit;") > /tmp/replace.sql; date
Wed Dec 22 10:12:31 AM CET 2021
Wed Dec 22 10:14:00 AM CET 2021
root@perf3 ~ # time mysql -P9306 -h0 < /tmp/replace.sql
real 0m23.248s
user 0m0.891s
sys 0m0.047s
Minor changes
- optimize_cutoff is now available as a configuration option in section searchd. It's useful when you want to limit the RT chunks count in all your indexes to a particular number globally.
- Commit 00874743 accurate count(distinct ...) and FACET ... distinct over several local physical indexes (real-time/plain) with identical fields set/order.
- PR #598 bigint support for YEAR() and other timestamp functions.
- Commit 8e85d4bc Adaptive rt_mem_limit. Previously Manticore Search was collecting exactly up to rt_mem_limit of data before saving a new disk chunk to disk, and while saving was still collecting up to 10% more (aka double-buffer) to minimize possible insert suspension. If that limit was also exhausted, adding new documents was blocked until the disk chunk was fully saved to disk. The new adaptive limit is built on the fact that we have auto-optimize now, so it's not a big deal if disk chunks do not fully respect rt_mem_limit and start flushing a disk chunk earlier. So, now we collect up to 50% of rt_mem_limit and save that as a disk chunk. Upon saving we look at the statistics (how much we've saved, how many new documents have arrived while saving) and recalculate the initial rate which will be used next time. For example, if we saved 90 million documents, and another 10 million docs arrived while saving, the rate is 90%, so we know that next time we can collect up to 90% of rt_mem_limit before starting flushing another disk chunk. The rate value is calculated automatically from 33.3% to 95%.
- Issue #628 unpack_zlib for PostgreSQL source. Thank you, Dmitry Voronin for the contribution.
- Commit 6d54cf2b indexer -v and --version. Previously you could still see indexer's version, but -v/--version were not supported.
- Issue #662 infinite mlock limit by default when Manticore is started via systemd.
- Commit 63c8cd05 spinlock -> op queue for coro rwlock.
- Commit 41130ce3 environment variable MANTICORE_TRACK_RT_ERRORS useful for debugging RT segments corruption.
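The adaptive rt_mem_limit rate described above can be illustrated with a small Python sketch. It assumes the rate is the share of documents saved out of all documents seen during a flush (saved plus newly arrived), clamped to the 33.3%..95% range. The function name, the exact formula, and the 50% fallback for an empty flush are assumptions inferred from the 90 million / 10 million example, not Manticore's actual code.

```python
def next_flush_rate(saved_docs: int, arrived_docs: int,
                    min_rate: float = 0.333, max_rate: float = 0.95) -> float:
    """Hypothetical helper: fraction of rt_mem_limit to collect before the next flush."""
    total = saved_docs + arrived_docs
    if total == 0:
        # Assumption: with no statistics yet, fall back to the initial 50% rate.
        return 0.5
    # Share of documents that were saved, relative to everything seen during the flush.
    rate = saved_docs / total
    # Keep the rate inside the range quoted in the release notes.
    return max(min_rate, min(max_rate, rate))


# The example from the text: 90M saved, 10M arrived while saving.
print(next_flush_rate(90_000_000, 10_000_000))
```

A very slow flush (many documents arriving while saving) pushes the result toward the lower bound, so the next disk chunk starts flushing earlier.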
Breaking changes
- Binlog version was increased, so binlog from the previous version won't be replayed. Make sure you stop Manticore Search cleanly during upgrade: no binlog files should be in /var/lib/manticore/binlog/ except binlog.meta after stopping the previous instance.
- Commit 3f659f36 new column "chain" in show threads option format=all. It shows stack of some task info tickets, most useful for profiling needs, so if you are parsing show threads output be aware of the new column.
- searchd.workers has been obsolete since 3.5.0 and is now deprecated. If you still have it in your configuration file, Manticore Search will start, but with a warning.
Bugfixes
- ❗Issue #650 Manticore 4.0.2 slower than Manticore 3.6.3. 4.0.2 was faster than previous versions in terms of bulk inserts, but significantly slower for single document inserts. It's been fixed in 4.2.0.
- ❗Commit 22f4141b RT index could get corrupted under intensive REPLACE load, or it could crash
- Commit 03be91e4 fixed average at merging groupers and group N sorter; fixed merge of aggregates
- Commit 2ea575d3 indextool --check could crash
- Commit 7ec76d4a RAM exhaustion issue caused by UPDATEs
- [Commit 658a727](https://github.com/manticoresoftware/manticoresearch/co...
Manticore Search 4.0.2
Version 4.0.2, Sep 21st 2021
➡️➡️➡️ DOWNLOAD HERE ⬅️⬅️⬅️
Major new features
- Full support of Manticore Columnar Library. Previously Manticore Columnar Library was supported only for plain indexes. Now it's supported:
  - in real-time indexes for INSERT, REPLACE, DELETE, OPTIMIZE
  - in replication
  - in ALTER
  - in indextool --check
- Automatic indexes compaction (#478). Finally you don't have to call OPTIMIZE manually or via a crontask or other kind of automation. Manticore now does it on its own. You can set default compaction threshold via optimize_cutoff.
- Chunk snapshots and locks system revamp. These changes may be invisible from outside at first glance, but they improve the behaviour of many things happening in real-time indexes significantly. In a nutshell, previously most Manticore data manipulation operations relied on locks heavily, now we use disk chunk snapshots instead.
  - read operations (e.g. SELECTs, replication) are performed with snapshots
  - operations that just change internal index structure without modifying schema/documents (e.g. merging RAM segments, saving disk chunks, merging disk chunks) are performed with read-only snapshots and replace the existing chunks in the end
  - UPDATEs and DELETEs are performed against existing chunks, but if a merge is in progress the writes are collected and then applied against the new chunks
  - UPDATEs acquire an exclusive lock sequentially for every chunk. Merges acquire a shared lock when entering the stage of collecting attributes from the chunk. So at the same time only one (merge or update) operation has access to attributes of the chunk.
  - when merging gets to the phase where it needs attributes, it sets a special flag. When UPDATE finishes it checks the flag and if it's set, the whole update is stored in a special collection. Finally when the merge finishes, it applies the updates set to the newborn disk chunk
  - ALTER runs via an exclusive lock
  - replication runs as a usual read operation, but in addition saves the attributes before SST and forbids updates during the SST
- ALTER can add/remove a full-text field. Previously it could only add/remove an attribute.
- 🔬 Experimental: pseudo sharding for full-scan queries - allows parallelizing any non-full-text search query. Instead of preparing shards manually you can now just enable the new option searchd.pseudo_sharding and expect response time for non-full-text search queries to drop by a factor of up to the number of CPU cores. Note it can easily occupy all existing CPU cores, so if you care not only about latency, but throughput too - use it with caution.
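The "flag plus deferred updates" mechanism from the chunk snapshots item above can be sketched in Python. This is a single-threaded illustration only: the class, the method names, and the way updates arriving mid-merge are simulated are all hypothetical and do not reflect Manticore's internal structures or its real locking.

```python
class Chunk:
    """Hypothetical disk chunk: maps document id -> attribute value."""

    def __init__(self, attrs):
        self.attrs = dict(attrs)
        self.merging = False        # the "special flag" set by a merge
        self.pending_updates = []   # the "special collection" of updates

    def update(self, doc_id, value):
        if doc_id not in self.attrs:
            return
        # The update is always applied to the existing chunk...
        self.attrs[doc_id] = value
        # ...and remembered if a merge has already read the attributes.
        if self.merging:
            self.pending_updates.append((doc_id, value))


def merge(a, b, updates_during_merge=()):
    """Merge two chunks; updates_during_merge simulates concurrent UPDATEs."""
    a.merging = b.merging = True
    merged_attrs = {**a.attrs, **b.attrs}   # attributes as seen by the merge
    for chunk, doc_id, value in updates_during_merge:
        chunk.update(doc_id, value)         # arrives while the merge is running
    new_chunk = Chunk(merged_attrs)
    for chunk in (a, b):
        # Replay the collected updates against the newborn chunk.
        for doc_id, value in chunk.pending_updates:
            new_chunk.attrs[doc_id] = value
        chunk.pending_updates.clear()
        chunk.merging = False
    return new_chunk


a, b = Chunk({1: 10}), Chunk({2: 20})
merged = merge(a, b, updates_during_merge=[(a, 1, 11)])
print(merged.attrs)
```

Without the replay step the merged chunk would keep the stale value for document 1, which is the problem the deferred collection is meant to avoid.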
Minor changes
- Linux Mint and Ubuntu Hirsute Hippo are supported via APT repository
- faster update by id via HTTP in big indexes in some cases (depends on the ids distribution)
3.6.0
time curl -X POST -d '{"update":{"index":"idx","id":4611686018427387905,"doc":{"mode":0}}}' -H "Content-Type: application/x-ndjson" http://127.0.0.1:6358/json/bulk
real 0m43.783s
user 0m0.008s
sys 0m0.007s
4.0.2
time curl -X POST -d '{"update":{"index":"idx","id":4611686018427387905,"doc":{"mode":0}}}' -H "Content-Type: application/x-ndjson" http://127.0.0.1:6358/json/bulk
real 0m0.006s
user 0m0.004s
sys 0m0.001s
- custom startup flags for systemd. Now you don't need to start searchd manually in case you need to run Manticore with some specific startup flag
- new function LEVENSHTEIN() which calculates Levenshtein distance
- added new searchd startup flags --replay-flags=ignore-trx-errors and --replay-flags=ignore-all-errors so one can still start searchd if the binlog is corrupted
- #621 - expose errors from RE2
- more accurate COUNT(DISTINCT) for distributed indexes consisting of local plain indexes
- FACET DISTINCT to remove duplicates when you do faceted search
- exact form modifier doesn't require morphology now and works for indexes with infix/prefix search enabled
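For readers unfamiliar with the metric behind the new LEVENSHTEIN() function, here is a minimal Python sketch of the classic dynamic-programming edit distance. It only illustrates what Levenshtein distance measures; it is not Manticore's implementation, and the function name and signature are illustrative.

```python
def levenshtein(a: str, b: str) -> int:
    """Number of single-character insertions, deletions or substitutions
    needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a                      # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))   # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]                    # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))
```

Only two rows of the distance table are kept at a time, so memory use is proportional to the shorter string.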
Breaking changes
- the new version can read older indexes, but the older versions can't read Manticore 4's indexes
- removed implicit sorting by id. Sort explicitly if required
- charset_table's default value changes from 0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F, U+401->U+451, U+451 to non_cjk
- OPTIMIZE happens automatically. If you don't need it make sure to set auto_optimize=0 in section searchd in the configuration file
- #616 ondisk_attrs_default was deprecated and has now been removed
- for contributors: we now use Clang compiler for Linux builds as according to our tests it can build a faster Manticore Search and Manticore Columnar Library
- if max_matches is not specified in a search query it gets updated implicitly with the lowest needed value for the sake of performance of the new columnar storage. It can affect metric total in SHOW META, but not total_found which is the actual number of found documents.
Migration from Manticore 3
- make sure you stop Manticore 3 cleanly:
  - no binlog files should be in /var/lib/manticore/binlog/ (only binlog.meta should be in the directory)
  - otherwise the indexes Manticore 4 can't replay binlogs for won't be run
- the new version can read older indexes, but the older versions can't read Manticore 4's indexes, so make sure you make a backup if you want to be able to roll back the new version easily
- if you run a replication cluster make sure you:
  - stop all your nodes first cleanly
  - and then start the node which was stopped last with --new-cluster (run tool manticore_new_cluster in Linux).
  - read about restarting a cluster for more details
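The "no binlog files except binlog.meta" condition above can be checked before starting the new version. The following Python sketch assumes the default /var/lib/manticore/binlog/ path mentioned in these notes; the function name is hypothetical, and your binlog directory may be configured elsewhere.

```python
from pathlib import Path


def leftover_binlogs(binlog_dir: str = "/var/lib/manticore/binlog/") -> list[str]:
    """Return names of files in binlog_dir other than binlog.meta.

    An empty list suggests the previous instance was stopped cleanly.
    """
    directory = Path(binlog_dir)
    return sorted(
        entry.name
        for entry in directory.iterdir()
        if entry.is_file() and entry.name != "binlog.meta"
    )


if __name__ == "__main__":
    default_dir = Path("/var/lib/manticore/binlog/")
    if default_dir.is_dir():
        leftovers = leftover_binlogs()
        if leftovers:
            print("Unreplayed binlog files found:", ", ".join(leftovers))
        else:
            print("Binlog directory is clean.")
```

If the list is not empty, start and cleanly stop the old version again before upgrading rather than deleting the files by hand.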
Bugfixes
- Lots of replication issues have been fixed:
  - 696f8649 - fixed crash during SST on joiner with active index; added sha1 verify at joiner node at writing file chunks to speed up index loading; added rotation of changed index files at joiner node on index load; added removal of index files at joiner node when active index gets replaced by a new index from donor node; added replication log points at donor node for sending files and chunks
  - b296c55a - crash on JOIN CLUSTER in case the address is incorrect
  - 418bf880 - during initial replication of a large index the joining node could fail with ERROR 1064 (42000): invalid GTID, (null), and the donor could become unresponsive while another node was joining
  - 6fd350d2 - hash could be calculated wrong for a big index which could result in replication failure
  - #615 - replication failed on cluster restart
- #574 - indextool --help doesn't display parameter --rotate
- #578 - searchd high CPU usage while idle after ca. a day
- #587 - flush .meta immediately
- #617 - manticore.json gets emptied
- #618 - searchd --stopwait fails under root. It also fixes systemctl behaviour (previously it was showing failure for ExecStop and didn't wait long enough for searchd to stop properly)
- #619 - INSERT/REPLACE/DELETE vs SHOW STATUS. command_insert, command_replace and others were showing wrong metrics
- #620 - charset_table for a plain index had a wrong default value
- [8f75368](https://github.com/manti...