How does georeferencing work without GCP?

Hello everyone, I have questions about how the PIX4DMapper software works.

I took RGB images of agricultural land with an RTK-equipped drone. I processed these images with PIX4DMapper and obtained orthophotos, a DSM and a DTM that are good enough for my purposes.
I then processed the same drone images with WebODM and also obtained orthophotos.
I noticed some significant differences between the two results.
The main difference that made me suspicious is that the images are scaled differently in the coordinate system (and yes, the coordinate system is the same in both), and the same locations in the photos have different coordinates.
So I wonder how the georeferencing of the orthophotos works without GCP.
Yes, PIX4DMapper has the GPS coordinates and orientation metadata of each image, which means the images can be arranged spatially.
However, I assume that combining the individual images into an orthophoto is a purely photogrammetric process and that the individual pixels or tie points are not georeferenced. Am I correct in my assumption?

So how is it possible that the orthophoto is georeferenced, but the individual key points are determined using photogrammetry?
And why do I get two orthophotos with very different georeferencing from two different programmes?
Thanks in advance for your help.

You left out way too much information.

Was your RTK drone receiving corrections during the flight? If not, then the accuracy of your image georeferencing will be measured in meters, not centimeters. This could account for your problem.

Your initial sparse cloud DOES have georeferencing for each tie point.

I have never used WebODM, so I cannot speak to it.

You may also have a datum problem, where in the WebODM project, the Pix4D project, or both you chose the incorrect datum for the images and/or the output.
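One rough way to tell a datum or offset problem from a scaling problem is to pick a few features you can clearly identify in both orthophotos, read their coordinates in each, and compare. This is only a sketch: the function name and the sample coordinates below are made up for illustration, and it uses nothing from Pix4D or WebODM. A nearly constant shift with a scale ratio close to 1 would point toward a datum or offset issue. A scale ratio clearly different from 1 would point toward a projection or units mismatch.

```python
import math

def compare_orthophotos(points_a, points_b):
    """Compare coordinates of the same features read from two orthophotos.

    points_a, points_b: lists of (easting, northing) tuples in the same
    order, one list per orthophoto. Returns (mean_dx, mean_dy, scale_ratio).
    """
    n = len(points_a)
    if n < 2 or n != len(points_b):
        raise ValueError("need at least two matching points in each list")

    # Mean shift from orthophoto A to orthophoto B.
    mean_dx = sum(b[0] - a[0] for a, b in zip(points_a, points_b)) / n
    mean_dy = sum(b[1] - a[1] for a, b in zip(points_a, points_b)) / n

    # Scale ratio: total of all pairwise distances in B divided by the same in A.
    total_a = 0.0
    total_b = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total_a += math.dist(points_a[i], points_a[j])
            total_b += math.dist(points_b[i], points_b[j])
    return mean_dx, mean_dy, total_b / total_a

# Invented example coordinates (meters), not taken from any real project.
ortho_a = [(500000.0, 5400000.0), (500100.0, 5400000.0), (500000.0, 5400100.0)]
ortho_b = [(500001.5, 5399998.0), (500101.5, 5399998.0), (500001.5, 5400098.0)]
print(compare_orthophotos(ortho_a, ortho_b))
```

The more well-spread points you use, the less a single misclicked feature will distort the result.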

With an RTK drone receiving corrections you can indeed obtain low-centimeter accuracy WITHOUT GCPs. But you will need to make sure that some or all of your images are oblique, so that you do not get a vertical shift. Also, you SHOULD always survey GCPs and use them as check points to “check” the accuracy of your outputs.
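For the check points, the usual summary number is the root mean square error (RMSE) between the surveyed coordinates and the coordinates you measure in the outputs. Here is a minimal sketch of that calculation, assuming you have both sets as (easting, northing, elevation) in the same coordinate system and units. The function name and the sample values are hypothetical.

```python
import math

def checkpoint_rmse(surveyed, measured):
    """Return (horizontal_rmse, vertical_rmse) for matching check points.

    surveyed, measured: lists of (easting, northing, elevation) tuples
    in the same order, coordinate system, and units.
    """
    n = len(surveyed)
    if n == 0 or n != len(measured):
        raise ValueError("need the same non-zero number of points in each list")

    # Sum of squared horizontal (2D) and vertical errors over all check points.
    sq_h = sum((m[0] - s[0]) ** 2 + (m[1] - s[1]) ** 2
               for s, m in zip(surveyed, measured))
    sq_v = sum((m[2] - s[2]) ** 2 for s, m in zip(surveyed, measured))
    return math.sqrt(sq_h / n), math.sqrt(sq_v / n)

# Invented example values (meters), for illustration only.
surveyed = [(500000.00, 5400000.00, 120.00), (500100.00, 5400050.00, 121.50)]
measured = [(500000.02, 5400000.01, 120.06), (500099.97, 5400050.02, 121.42)]
h_rmse, v_rmse = checkpoint_rmse(surveyed, measured)
print(f"horizontal RMSE: {h_rmse:.3f} m, vertical RMSE: {v_rmse:.3f} m")
```

A vertical RMSE much larger than the horizontal one would be consistent with the vertical shift mentioned above.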