If you happened to read my last blog post you saw that I fixed all the assistants to work with an OpenGL 3.2 Core Profile. Let's take a step back and see why this fix was necessary and what was wrong with the code.
History
In Krita 3.0 we introduced 'Instant preview'. This is a mechanism for speeding up big brush strokes on large canvases and uses OpenGL3. Before this mechanism Krita exclusively used OpenGL2 and below.
A side-effect of OpenGL3 is that it deprecated some functions from older versions. Now, normally this isn't a problem as Windows and Linux support a thing called Compatibility Profile, which allows the user to use new and deprecated functions together.
However, Mac OS does not support this compatibility profile, which leaves us two choices: either don't use any functionality from OpenGL3, or remove all deprecated functions from our code.
Our solution
By now you might have guessed that we chose the latter option and set out to remove all deprecated functionality. The problem here is that not all of this legacy code is in Krita; some of it is actually in Qt (which is the library we use for the graphical user interface). More specifically, we use some Qt functions that contain legacy code to draw our canvas decorations (brush outline, assistants, etc.).
Since we don't have direct control over the Qt code, we decided to copy their legacy code into Krita and use this copied code to implement our fix. This means that drawing the decorations would now make use of our copied code instead of the Qt code. Ultimately however, we don't want to keep using this copied code as it would be a nightmare to keep up-to-date with the current Qt version. Therefore, the plan was to implement our fix and send our patch back to the guys over at Qt for them to merge it into their library.
So what did this fix involve exactly?
Qt:
- Updating every Qt shader to use GLSL 1.5
- Creating a VAO + several VBOs for uploading data to the GPU
- Dynamically switching between old OGL versions and new versions
Krita:
- Updating Krita shaders to use GLSL 1.5
- Creating appropriate VAO + VBOs for tool outlines and canvas textures
- Dynamically switching between old OGL versions and new versions
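To make the "dynamically switching" item a bit more concrete, here is a minimal sketch of the idea in Python. It is not the actual Qt or Krita code: the shader sources, the names `LEGACY_FRAGMENT` and `CORE_FRAGMENT`, and the function `select_fragment_shader` are all made up for illustration. It only shows the kind of change GLSL 1.5 asks for (`in`/`out` instead of `varying`, a user-defined output instead of `gl_FragColor`, `texture` instead of `texture2D`) and how one might choose a variant at run time.

```python
# Hypothetical sketch: choose a shader variant based on the current OpenGL context.
# The shader sources below are illustrative, not copied from Qt or Krita.

LEGACY_FRAGMENT = """\
#version 120
varying vec2 v_texcoord;
uniform sampler2D tex;
void main() {
    gl_FragColor = texture2D(tex, v_texcoord);
}
"""

CORE_FRAGMENT = """\
#version 150
in vec2 v_texcoord;
uniform sampler2D tex;
out vec4 frag_color;
void main() {
    frag_color = texture(tex, v_texcoord);
}
"""


def select_fragment_shader(major, minor, core_profile):
    """Return the GLSL 1.50 source on an OpenGL 3.2+ core context, else the legacy one."""
    if core_profile and (major, minor) >= (3, 2):
        return CORE_FRAGMENT
    return LEGACY_FRAGMENT
```

In a real application the version and profile would come from the context that was actually created, rather than being passed in by hand.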
As you saw the fix worked well using our copied Qt code, but now the next step was to move these fixes to the current Qt 5.7 code. Of course, Qt 5.7 contained some changes that weren't in our old copied code, so I had to merge my changes manually into the new files. Luckily this all went well and my first custom Qt installation was born.
And then, le moment suprême, as we run Krita with this custom Qt version...
Well.. unfortunately it didn't run...
On start-up Krita complained that there wasn't a valid context bound or that the OpenGL implementation does not support buffers. This happened in a piece of code that is completely unrelated to my fixes, but one that I luckily recently had a look at.
The unnerving thing is that my fix contains nothing that meddles with the OpenGL context and doesn't touch the file that gave the error. What's even worse, when debug printing the current context in that file it looked perfectly intact. So what could possibly be causing this?
Well, it turned out that there was no such error when I ran Krita without my fix, so it had to be something I had done. Alas, there was nothing left to do but to very slowly remove parts of my fix until the error stopped appearing, while at the same time keeping the code runnable.
Finally, I found the troublesome piece of code. It was already present in Qt and I had commented it out as it is chock-full of deprecated functions. The act of commenting out this piece of code apparently has severe consequences on unrelated files. I have no idea why...
Uncommenting this piece of code no longer caused any issues and fixed the error, soooo... ¯\_(ツ)_/¯
Sending the patch to Qt
Last Wednesday I cleaned up the fixes and sent in a change request to the Qt people. Over the coming days we will discuss the best way to implement parts of it in preparation of them taking in the changes so that we may drop our copied code and just use Qt as-is.
Their vision is to keep support for deprecated functionality, but to also allow the user to pick an OpenGL3.2 Core Profile which removes all these functions. This means I will have to implement checks in the fix to see which profile the user has requested. This incremental preparation of the patch will happen over a couple of weeks as we get closer to a solution we are both happy with.
Bonus talk
As one might imagine it is not a super fast process to update fixes and wait for comments and critique from the patch reviewers. This leaves me with some extra time in my Summer of Code to look at other parts of Krita. In particular, I am interested in the deep dark depths of the Krita painting engine.
The first part of these depths that I looked at was the way in which parts of the canvas are updated as people paint on it. This happens in a tile-based manner.
The canvas is divided into tiles of size 256x256, and as paint strokes hit certain tiles only those tiles get updated. An image would look something like this to Krita internally:
You notice I've drawn some red borders around the tiles. These borders represent where we extend each tile by 16 pixels on every side. This tile + border together is a 256x256 texture (so the effective size of the actual image tile is only 224x224).
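As a rough illustration of "only the tiles that are hit get updated", the sketch below works out which tiles a rectangular dirty region overlaps. The function name `tiles_hit` and its parameters are hypothetical and not taken from Krita; the tile size is a parameter, so the same arithmetic applies whether tiles are laid out by texture size or by effective size.

```python
def tiles_hit(x, y, width, height, tile_size=256):
    """Return the (column, row) indices of the tiles overlapped by a dirty rectangle.

    (x, y) is the top-left corner of the rectangle in image pixels.
    """
    first_col, first_row = x // tile_size, y // tile_size
    last_col = (x + width - 1) // tile_size
    last_row = (y + height - 1) // tile_size
    return [(col, row)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]
```

For example, a 20x20 dab whose corner is at (250, 10) straddles a vertical tile boundary, so `tiles_hit(250, 10, 20, 20)` reports two tiles and the rest of the canvas is left alone.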
Why do we extend each tile by 16 pixels? Well we keep what is called 'levels of detail' of the image. Effectively what this means is that we keep lower quality versions of the image (also called mip-maps). These levels of detail are progressively lower in resolution by powers of 2. So if the original image had a resolution of 1024x1024 its mip-maps would be: 512x512, 256x256, 128x128, 64x64 etc.
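The halving can be written down in a few lines. This is only a sketch of the arithmetic; the function name `mip_sizes` is made up for the example.

```python
def mip_sizes(width, height, levels):
    """List the resolution of each level of detail, starting with the original image."""
    sizes = []
    for _ in range(levels):
        sizes.append((width, height))
        # Each level is half the size of the previous one, but never below 1 pixel.
        width, height = max(1, width // 2), max(1, height // 2)
    return sizes
```

Calling `mip_sizes(1024, 1024, 5)` gives the original 1024x1024 image followed by 512x512, 256x256, 128x128 and 64x64.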
To see why these levels of detail are useful we have to dive into the implementation of 'Instant preview'. Essentially what that mechanic does is simulate a user's brush stroke on a lower level of detail where it is much faster to calculate and show this preview to the user, while in the background it is applying the brush stroke to the actual image. This gives the user an 'instant preview' of the brush stroke and retains the integrity of the image.
But I still haven't told you about why we need this border around the image. Well this has to do with the filtering we perform. To show a high-quality image at all zoom levels we might apply filters such as bilinear interpolation. For every pixel you see on screen bilinear interpolation takes the 4 pixels closest to the pixel you want to calculate and averages these according to how close they are.
In the image below you see a pixel with an imprecise position (because it has been zoomed in/out) called (x, y) for which we want to calculate the colour, and the 4 closest pixels in the actual image: (x1, y1), (x2, y1), (x2, y2) and (x1, y2). The colour of the pixel is then a weighted average of the colours of those four pixels, where each pixel's colour is weighted by the area of the rectangle diagonally opposite it (so closer pixels count for more).
Now that you have an idea of how bilinear interpolation works, you might ask yourself how this works when the pixel is at the edge of the image, because obviously there aren't any pixels outside of the image to sample colours from.
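Written out for a single colour channel, the area weighting could look like the sketch below. In practice the GPU's texture filtering does this for us; the function name `bilinear` and the argument order are just choices made for this example.

```python
def bilinear(x, y, x1, y1, x2, y2, c11, c21, c12, c22):
    """Interpolate one colour channel at (x, y).

    c11 is the value at (x1, y1), c21 at (x2, y1),
    c12 at (x1, y2) and c22 at (x2, y2).
    """
    total_area = (x2 - x1) * (y2 - y1)
    # Each corner is weighted by the area of the rectangle diagonally opposite it.
    w11 = (x2 - x) * (y2 - y)
    w21 = (x - x1) * (y2 - y)
    w12 = (x2 - x) * (y - y1)
    w22 = (x - x1) * (y - y1)
    return (c11 * w11 + c21 * w21 + c12 * w12 + c22 * w22) / total_area
```

Exactly in the middle of the four pixels all weights are equal, so the result is the plain average of the four colours; exactly on top of one pixel, that pixel's weight is the whole area and the others are zero.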
Well this is exactly why we need an extra border of pixels around the image. We need at least one extra pixel around the image in order to handle the corner cases of bilinear interpolation. But what colour should this border be? It should be the colour of the pixel directly next to it! So in a way we are just taking all the pixels of the image edge and copying them to form a 1 pixel border.
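Copying the edge pixels outwards can be sketched as follows, using a plain list of rows as a stand-in for an image. This is an illustration of the idea only, not Krita's implementation, and `pad_replicate` is a made-up name.

```python
def pad_replicate(image, border=1):
    """Return a copy of `image` (a list of rows) extended on every side
    by `border` pixels, each copied from the nearest edge pixel."""
    height, width = len(image), len(image[0])

    def clamp(value, highest):
        return max(0, min(value, highest))

    return [[image[clamp(row, height - 1)][clamp(col, width - 1)]
             for col in range(-border, width + border)]
            for row in range(-border, height + border)]
```

A 2x2 image `[[1, 2], [3, 4]]` becomes a 4x4 image in which every border pixel has the colour of its nearest neighbour inside the original.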
But.. we have a 16 pixel border? Here is where the mip-mapping comes in. If we want to have a 1 pixel border at the lowest level of detail, we should have a border that is 2 pixels on the next higher level of detail (LoD). This is the case because if the second-to-lowest LoD is halved in size to form the lowest LoD we end up with a 1 pixel border.
In Krita we store five levels of detail (including the original image) and so we need a 1px, 2px, 4px, 8px and finally a 16px border on the original image.
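The doubling is easy to check with a few lines of arithmetic. The function name `border_per_lod` is invented for this sketch; the numbers (five levels, a 1 pixel border at the lowest level, 256x256 textures) are the ones mentioned in this post.

```python
def border_per_lod(levels=5, smallest_border=1):
    """Border width in pixels at each level of detail, original image first."""
    return [smallest_border * 2 ** (levels - 1 - lod) for lod in range(levels)]


borders = border_per_lod()            # [16, 8, 4, 2, 1]
texture_size = 256
# The border is on both sides, so it is subtracted twice.
usable_tile_size = texture_size - 2 * borders[0]   # 224
```

Halving each entry gives the next one, which is exactly the property we need: every time the image is halved, its border is halved with it, and the lowest level still ends up with one full pixel.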
So far I have been talking about these borders in the context of the image, but actually we need this border around every tile, as the tiles are little images that together form the complete image. So now you hopefully understand the red lines on the Kiki image.
Speed up
You might be wondering why I am telling you all of this. While going through the code that handles all this tile business I found out that the code that extends each tile by 16 pixels takes up half of the processing time of each tile. This means that when you are drawing, half of the time it spends updating your canvas is spent on extending the tiles a little bit.
Here is my tiny benchmark of updating a full 8000x8000 canvas with and without tile borders:
Time taken on updating 8000x8000 canvas with borders: ~402ms
Run | Time taken on CPU | Time taken on GPU | Total time |
---|---|---|---|
1 | 123ms | 106.3ms | 229.3ms |
2 | 237ms | 208.4ms | 445.4ms |
3 | 140ms | 135.7ms | 275.7ms |
4 | 155ms | 148.1ms | 303.1ms |
5 | 325ms | 256.3ms | 581.3ms |
6 | 279ms | 249.7ms | 528.7ms |
7 | 237ms | 208.2ms | 445.2ms |
8 | 283ms | 267.2ms | 550.2ms |
9 | 225ms | 209.0ms | 434.0ms |
10 | 122ms | 109.8ms | 231.8ms |
Time taken on updating 8000x8000 canvas without borders: ~194ms
Run | Time taken on CPU | Time taken on GPU | Total time |
---|---|---|---|
1 | 55ms | 52.8ms | 107.8ms |
2 | 87ms | 123.8ms | 210.8ms |
3 | 82ms | 125.3ms | 207.3ms |
4 | 46ms | 122.8ms | 168.8ms |
5 | 245ms | 197.3ms | 442.3ms |
6 | 53ms | 45.6ms | 98.6ms |
7 | 46ms | 125.1ms | 171.1ms |
8 | 47ms | 122.9ms | 169.9ms |
9 | 50ms | 122.9ms | 172.9ms |
10 | 61ms | 124.7ms | 185.7ms |
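For anyone who wants to check the averages, the snippet below recomputes them from the "Total time" columns of the two tables. The variable names are only for this example.

```python
# Total time per run in milliseconds, copied from the two tables.
with_borders = [229.3, 445.4, 275.7, 303.1, 581.3, 528.7, 445.2, 550.2, 434.0, 231.8]
without_borders = [107.8, 210.8, 207.3, 168.8, 442.3, 98.6, 171.1, 169.9, 172.9, 185.7]


def mean(values):
    return sum(values) / len(values)


# Ratio of the two averages: how much faster a full update is without borders.
speedup = mean(with_borders) / mean(without_borders)
```

The two means round to the ~402ms and ~194ms quoted with the tables, and the ratio between them comes out a little above 2. With only ten runs and quite some spread between them, this is a rough indication rather than a precise measurement.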
I think the current implementation of this extending has a lot of room for optimisation. So I will spend the time I have left while waiting for Qt critique on trying to optimise this border extension and possibly getting a nice speed-up on the painting. I doubt it will be twice as fast, because I am sure there are a lot of other things going on during a paint stroke, but it will at least go some way towards squeezing more performance out of Krita.
thanks, interesting and well written :)
Aren't glMatrixMode and stuff deprecated in GL3? You should setup all MVP matrices without any gl call (usually glm is used for this as it supports) and send them shaders as any other parameter.
Yes that is correct, which is why I originally commented it out. It should be noted however that this piece of code is in an if-statement that checks whether we are not using GL3.2+. So it is even weirder how something that shouldn't execute in the first place has an effect on unrelated code.
I should also say that Qt doesn't use any matrices in their shaders and that is fine. But if they did they would probably use their own QMatrix class.
The Mesa-based open-source drivers don't support compat profiles either, so this should improve the experience for many Linux users.
Nice. :-)