{"id":40525,"date":"2026-06-09T16:51:27","date_gmt":"2026-06-09T13:51:27","guid":{"rendered":"https:\/\/www.thrivedesk.com\/?p=40525"},"modified":"2026-06-09T16:51:35","modified_gmt":"2026-06-09T13:51:35","slug":"the-case-of-meilisearch-missing-id","status":"publish","type":"post","link":"https:\/\/www.thrivedesk.com\/es\/the-case-of-meilisearch-missing-id\/","title":{"rendered":"The Case of Meilisearch Missing ID"},"content":{"rendered":"<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p class=\"has-text-align-center wp-block-paragraph\"><em>everything works until you get clever with indexing<\/em><\/p>\n<\/blockquote>\n\n\n\n<p class=\"wp-block-paragraph\">At ThriveDesk, we move a lot of customer conversations, and we use Meilisearch to keep search over them fast. One quiet afternoon, in the middle of a reindex job, it threw this at me:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>Document doesn't have an <span style=\"background-color: initial;font-family: inherit;font-size: inherit;text-align: initial\"><strong>`id`<\/strong><\/span> attribute<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Every document in the batch had an <code><em>id<\/em><\/code>, so a &#8220;<em><code>document doesn't have an id attribute<\/code><\/em>&#8221; error made no sense. I checked twice. The error turned out to be entirely my fault, and fixing it meant dropping one of Laravel Scout&#8217;s conveniences and talking to Meilisearch directly. Here&#8217;s what happened.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Why not just use Laravel Scout?<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\"><a href=\"https:\/\/laravel.com\/docs\/scout\" target=\"_blank\" rel=\"noreferrer noopener\">Scout<\/a> is the easy path when all you want is &#8220;<em><code>keep this model in sync with the index<\/code><\/em>&#8221;: you mark the model searchable and forget about it. 
The catch is that Scout hides the one thing we care about once volume goes up: the <em>task ID<\/em>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Every async operation in Meilisearch, like adding or deleting documents, returns a <code>taskUid<\/code>. You can poll that ID to find out whether the work actually landed. Scout doesn&#8217;t surface it. When you&#8217;re pushing thousands of conversations through a pipeline, &#8220;<code><em>I called searchable(), so it probably worked<\/em><\/code>&#8221; isn&#8217;t a monitoring strategy. We wanted to record every enqueued task, retry those that failed, and check the actual state of the index.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">So we skip Scout&#8217;s helper and call Meilisearch&#8217;s <code><a href=\"https:\/\/github.com\/meilisearch\/meilisearch-php\/blob\/c6517510d7fc7c8d92fbe476e3d942ca2d03915e\/src\/Endpoints\/Delegates\/HandlesDocuments.php#L46\" target=\"_blank\" rel=\"noreferrer noopener\"><em>addDocuments()<\/em><\/a><\/code> and <code><a href=\"https:\/\/github.com\/meilisearch\/meilisearch-php\/blob\/c6517510d7fc7c8d92fbe476e3d942ca2d03915e\/src\/Endpoints\/Delegates\/HandlesDocuments.php#L195\" target=\"_blank\" rel=\"noopener\"><em>deleteDocuments()<\/em><\/a><\/code> ourselves through the underlying engine, then grab the <code>taskUid<\/code> and store it.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">The actual bug: a sparse array<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">I was batching conversations to index, and some of them were duplicates, so I dropped the duplicates with <code><em><a href=\"https:\/\/laravel.com\/docs\/master\/collections#method-unique\" target=\"_blank\" rel=\"noopener\">-&gt;unique()<\/a><\/em><\/code>. 
The result looked like this:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$filtered = collect(&#091;\n    0 =&gt; &#091;'id' =&gt; 1],\n    1 =&gt; &#091;'id' =&gt; 2],\n    3 =&gt; &#091;'id' =&gt; 4], \/\/ index 2 is gone\n]);<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Every element still has an <code><em>id<\/em><\/code>, so at a glance it looks fine. But look at the keys: <em>0, 1, 3<\/em>. When <code><em>unique()<\/em><\/code> removed the duplicate at index <em>2<\/em>, it kept the original keys and left a hole. The array is now sparse, and that hole is the whole problem. The reason it breaks is <strong>PHP<\/strong>, not <strong>Meilisearch<\/strong>.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">When <em><code><a href=\"https:\/\/www.php.net\/manual\/en\/function.json-encode.php\" target=\"_blank\" rel=\"noreferrer noopener\">json_encode()<\/a><\/code><\/em> gets an array whose integer keys run <em>0, 1, 2, \u2026<\/em> with no gaps, it outputs a JSON array:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>&#091;{\"id\":1},{\"id\":2},{\"id\":4}]<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Put a gap in those keys, and PHP can&#8217;t treat it as a list anymore, so it falls back to encoding a JSON object instead:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>{\"0\":{\"id\":1},\"1\":{\"id\":2},\"3\":{\"id\":4}}<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><code><em>addDocuments()<\/em><\/code> expects an array of documents. Give it an object, and it reads the entire payload as a single document whose top-level fields are <em>&#8220;0&#8221;, &#8220;1&#8221;, <\/em>and<em> &#8220;3&#8221;<\/em>. That one <em><code>document<\/code> <\/em>has no <em><code>id<\/code> <\/em>field, so the error was telling the literal truth. 
I&#8217;d handed it the wrong shape.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">The fix<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\"><code><em><a href=\"https:\/\/laravel.com\/docs\/13.x\/collections#method-values\" target=\"_blank\" rel=\"noopener\">-&gt;values()<\/a><\/em><\/code> throws away the keys and renumbers them from zero, which is enough to get a JSON array back:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$task = $engine-&gt;index($this-&gt;index)-&gt;addDocuments(\n    $uniqueSearchableDocuments-&gt;filter()-&gt;values()-&gt;all(),\n    (new Conversation)-&gt;getKeyName()\n);<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Same thing on the delete path:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$task = $engine-&gt;getIndex($this-&gt;index)-&gt;deleteDocuments(\n    $shouldMakeUnsearchable-&gt;values()-&gt;all()\n);<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The second argument to <code><em>addDocuments()<\/em><\/code> is the primary key name. We read it off the model with <code><em>getKeyName()<\/em><\/code> instead of hardcoding <code><em>id<\/em><\/code>, so it keeps working if that ever changes.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">The main takeaways<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\"><code><em><a href=\"https:\/\/laravel.com\/docs\/13.x\/collections#method-filter\" target=\"_blank\" rel=\"noopener\">filter()<\/a><\/em><\/code> and <code><em><a href=\"https:\/\/laravel.com\/docs\/13.x\/collections#method-unique\" target=\"_blank\" rel=\"noopener\">unique()<\/a><\/em><\/code> keep the collection&#8217;s original keys. That&#8217;s deliberate, and inside Laravel it&#8217;s usually what you want. 
It stops being what you want the moment the data leaves PHP for an external API, so get in the habit of calling <code><em>-&gt;values()<\/em><\/code> after either one when you&#8217;re about to serialize.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">It&#8217;s also a decent reminder that bugs like this live at the boundary between systems. It read like a Meilisearch problem right up until I looked at the actual JSON, at which point it was clearly a serialization problem I&#8217;d made myself.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">And if you actually need to know whether your indexing worked, <code><em><a href=\"https:\/\/laravel.com\/docs\/13.x\/scout#adding-records-via-query\" target=\"_blank\" rel=\"noopener\">searchable()<\/a><\/em><\/code> won&#8217;t tell you. Calling the Meilisearch methods directly is more code, but it&#8217;s what gives us the <em><code>taskUid<\/code> <\/em>and lets us build retries on top of it.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">How the pipeline runs in production<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">A <code>BatchTask<\/code> row drives each batch. Searchable tasks load the conversations, dedupe them, reset the keys with <code><em>values()<\/em><\/code>, call <em><code>addDocuments()<\/code><\/em>, and save the returned <code>taskUid<\/code>. Unsearchable tasks load the targets, reset the keys, call <em><code>deleteDocuments()<\/code><\/em>, and save the <em><code>taskUid<\/code> <\/em>the same way. 
The diagram below shows both paths.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img decoding=\"async\" width=\"2560\" height=\"1306\" src=\"https:\/\/www.thrivedesk.com\/wp-content\/uploads\/2026\/06\/meilisearch-workflow-1-scaled.png\" alt=\"\" class=\"wp-image-40529\" title=\"\" srcset=\"https:\/\/www.thrivedesk.com\/wp-content\/uploads\/2026\/06\/meilisearch-workflow-1-scaled.png 2560w, https:\/\/www.thrivedesk.com\/wp-content\/uploads\/2026\/06\/meilisearch-workflow-1-768x392.png 768w, https:\/\/www.thrivedesk.com\/wp-content\/uploads\/2026\/06\/meilisearch-workflow-1-1536x784.png 1536w\" sizes=\"(max-width: 2560px) 100vw, 2560px\" \/><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">Tracking the <em><code>taskUid<\/code> <\/em>per batch has paid off. We can tell whether a batch succeeded or failed instead of assuming it worked. Failed tasks get retried with an attempt counter, and the <code><em>values()<\/em><\/code> fix also cleared out a pile of false positive errors that had been adding noise to our logs.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Closing thought<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Search infrastructure looks simple until you run it at scale, and then the small assumptions are the ones that get you. This time it was a Laravel collection that looked like a valid JSON array and stopped being one the moment it crossed into Meilisearch.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">So if you take one thing from this, make it this: after you filter or dedupe a collection that&#8217;s headed for an external system, call <code><em>-&gt;values()<\/em><\/code>. It&#8217;s cheap, and it&#8217;ll save you an afternoon.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\" \/>\n\n\n\n<p class=\"wp-block-paragraph\"><em>We do a fair amount of this at ThriveDesk: batching, async task tracking, and embeddings for semantic search. 
If you&#8217;re wiring up Laravel and Meilisearch and want to compare notes, get in touch.<\/em><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>","protected":false},"excerpt":{"rendered":"<p>everything works until you get clever with indexing At ThriveDesk, we move a lot of customer conversations, and we use Meilisearch to keep searching over them fast. One quiet afternoon, in the middle of a reindex job, it threw this at me: Every document in the batch had an id, so a &#8220;document doesn&#8217;t have [&hellip;]<\/p>\n","protected":false},"author":17,"featured_media":40526,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[87],"tags":[],"class_list":["post-40525","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-engineering"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.thrivedesk.com\/es\/wp-json\/wp\/v2\/posts\/40525","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.thrivedesk.com\/es\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.thrivedesk.com\/es\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.thrivedesk.com\/es\/wp-json\/wp\/v2\/users\/17"}],"replies":[{"embeddable":true,"href":"https:\/\/www.thrivedesk.com\/es\/wp-json\/wp\/v2\/comments?post=40525"}],"version-history":[{"count":28,"href":"https:\/\/www.thrivedesk.com\/es\/wp-json\/wp\/v2\/posts\/40525\/revisions"}],"predecessor-version":[{"id":40556,"href":"https:\/\/www.thrivedesk.com\/es\/wp-json\/wp\/v2\/posts\/40525\/revisions\/40556"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.thrivedesk.com\/es\/wp-json\/wp\/v2\/media\/40526"}],"wp:attachment":[{"href":"https:\/\/www.thrivedesk.com\/es\/wp-json\/wp\/v2\/media?parent=40525"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.thrivedesk.com\/es\/wp-json\/wp\/v2\/categories?post=40525"},{"
taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.thrivedesk.com\/es\/wp-json\/wp\/v2\/tags?post=40525"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}