Jonas Bonn
2013-07-30 12:35:01 UTC
Hi,
This is a bit of a braindump but hopefully it's reasonably coherent.
A discussion on IRC leads me to believe that we can get "huge pages" on
OpenRISC given what we have today. This is without using the ATB mechanism.
The arch spec allows PTEs to be located either in a level-1 page
directory or in a level-2 page table. A bit "L" in the page directory
entry (level 1) indicates whether the entry points to a page containing
a page table or whether it points to a "huge page". A huge page has a
24-bit offset and is thus 16MB in size.
When a page is "huge", the TLB needs to know about it. That's what the
PL1 bit is for. I'd like to see this bit renamed HUGE in order to
indicate that it's just matching the high 8 bits of the page frame when
looking for a translation.
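
As a rough sketch of what that matching rule amounts to: the struct and
field names below are made up for illustration and are not taken from the
spec or from any existing implementation. It assumes 32-bit addresses,
8kB normal pages and 16MB huge pages as described above.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT      13u  /* 8kB normal pages: 13-bit offset */
#define HUGE_PAGE_SHIFT 24u  /* 16MB huge pages: 24-bit offset */

/* Hypothetical TLB entry; the layout is an assumption, not the spec's. */
struct tlb_entry {
    uint32_t vpn;   /* virtual page number, VA bits 31-13 (19 bits) */
    bool     huge;  /* the PL1 bit, called HUGE here */
    bool     valid;
};

/* Return true if entry e translates virtual address va. */
static bool tlb_match(const struct tlb_entry *e, uint32_t va)
{
    if (!e->valid)
        return false;
    if (e->huge) {
        /* Huge page: only the high 8 bits (VA bits 31-24) are compared. */
        uint32_t entry_top = e->vpn >> (HUGE_PAGE_SHIFT - PAGE_SHIFT);
        return entry_top == (va >> HUGE_PAGE_SHIFT);
    }
    /* Normal page: the full 19-bit page number is compared. */
    return e->vpn == (va >> PAGE_SHIFT);
}
```

With the bit set, one entry covers every address that shares the same top
8 bits; with it clear, the same entry covers a single 8kB page.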
An example user of the "huge page" mechanism would be the Linux kernel
which maps itself into contiguous physical memory from 0 to
end_of_kernel. If we carefully manage the fact that it's not using 16MB
of physical memory, we could use the "huge page" mechanism to prevent a
lot of TLB misses when accessing kernel space code and data.
(Of course, 16MB might actually be too large for reasonable huge
pages... 2MB or 1MB might be better, see end of this mail)
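
To put numbers on the TLB pressure, here is a small helper that counts
how many pages (and hence TLB entries) are needed to map a region of a
given size. The function name and the example kernel size are
assumptions for illustration only.

```c
#include <stdint.h>

/*
 * Number of pages of size 2^shift bytes needed to cover `bytes` bytes,
 * rounding up. Written without adding to `bytes` so it cannot overflow.
 */
static uint32_t pages_needed(uint32_t bytes, unsigned shift)
{
    uint32_t page_size = UINT32_C(1) << shift;
    return (bytes >> shift) + ((bytes & (page_size - 1)) != 0);
}
```

For a hypothetical 6MB kernel image, `pages_needed(6u << 20, 13)` gives
768 8kB pages, `pages_needed(6u << 20, 21)` gives 3 2MB pages, and
`pages_needed(6u << 20, 24)` gives a single 16MB page.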
For this to work, the PL1 bit would need to be implemented... the fact
that it's not today is a bug in all our implementations as it's not an
optional feature.
Some changes along these lines that may be needed in the arch spec are:
8.4.1 DMMUCR
PTBP should be bits 31-13, not 31-10... page frames are always 8kB in
size and need to be page aligned
8.4.2 DMMUPR
Drop this register altogether (see 8.8 below). 4 bits in each set gives
16 combinations, but many of these really don't make sense so this
flexibility really isn't needed.
8.4.3 IMMUCR
PTBP should be bits 31-13, not 31-10... page frames are always 8kB in
size and pretty much need to be page aligned
8.4.4 IMMUPR
Drop this register altogether (see 8.8 below)
Note that this register is overdimensioned... it has 7 sets with 2 bits
each.
8.4.6
Change name of PL1 to HUGE with description:
0: normal page, 8kB
1: huge page, 16MB (or 2MB, see below)
Change LRU from "last recently used" to "least recently used" (cosmetic)
8.4.9 - 8.4.11
Drop ATBs altogether. We can get 16MB pages without them and the 32GB
pages aren't realistic anyway.
8.8 PTE
Change PPN size to 19 bits (bits 31-13).
PPI: Why only 7 sets of protection bits? Why not 8? Because value 0
is overloaded to mean the entry is invalid, but this prevents the field
from being used as a sane bitmask. Change the PPI field to 3 individual
bits indicating Writable, User access, and Executability and drop the
Protection Registers altogether.
As per Stefan's earlier mail, make PTE something like this:
| 31 ... 13 |  12 | 11  |   10  | 9 | 8 | 7 | 6 | 5 | 4 |  3  |  2  | 1  | 0  |
|    PPN    |OS-specific|Present| L | X | W | U | D | A | WOM | WBC | CI | CC |
...and we need a VALID bit in there somewhere.
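
For reference, the proposed layout could be written down as C masks
roughly like this. The bit positions are my reading of the table above,
assuming the OS-specific field takes bits 12-11; the macro and function
names are invented for this sketch. No VALID bit is defined since its
position is still open.

```c
#include <stdint.h>

/* Bit positions assumed from the proposed layout; names are illustrative. */
#define PTE_CC        (UINT32_C(1) << 0)
#define PTE_CI        (UINT32_C(1) << 1)
#define PTE_WBC       (UINT32_C(1) << 2)
#define PTE_WOM       (UINT32_C(1) << 3)
#define PTE_A         (UINT32_C(1) << 4)   /* accessed */
#define PTE_D         (UINT32_C(1) << 5)   /* dirty */
#define PTE_U         (UINT32_C(1) << 6)   /* user access */
#define PTE_W         (UINT32_C(1) << 7)   /* writable */
#define PTE_X         (UINT32_C(1) << 8)   /* executable */
#define PTE_L         (UINT32_C(1) << 9)   /* last level / huge page */
#define PTE_PRESENT   (UINT32_C(1) << 10)
#define PTE_OS_MASK   (UINT32_C(3) << 11)  /* two OS-specific bits */
#define PTE_PPN_MASK  UINT32_C(0xFFFFE000) /* bits 31-13, 19 bits */

/* Build a PTE from a physical address and flag bits. */
static uint32_t pte_make(uint32_t phys_addr, uint32_t flags)
{
    return (phys_addr & PTE_PPN_MASK) | (flags & ~PTE_PPN_MASK);
}

/* With individual W and U bits, permission checks are plain bit tests. */
static int pte_user_can_write(uint32_t pte)
{
    uint32_t need = PTE_PRESENT | PTE_U | PTE_W;
    return (pte & need) == need;
}
```

This is the point of replacing PPI with separate bits: a check like
`pte_user_can_write()` needs no lookup in a protection register.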
----------------------
So how do we get 2MB huge pages... here's my suggestion.
Top-level page directory

           -------------------------------
0x0000...  | 8-bit index entry, L=0      |
           -------------------------------
0x0020...  | Empty                       |
   to      ~            ...              ~
0x00e0...  | Empty                       |
           -------------------------------
0x0100...  | Next 8-bit index entry, L=1 |
           -------------------------------
0x0120...  | 2 MB page entry (L=1)       |
   to      ~            ...              ~
0x01e0...  | 2 MB page entry (L=1)       |
           -------------------------------
0x0200...  | Next 8-bit index entry, L=0 |
           |            ~~~              |
           ~                             ~

The top-level page directory is an 8kB page, and its 8-bit indexing
makes it sparsely populated. If we find that the L bit (huge page) is
set on an 8-bit indexed entry, then we could do a second indexing on the
remaining three bits (11-bit index total) to find the entry for the 2MB
huge page in the "free space".
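
A minimal sketch of that two-step lookup, assuming 4-byte entries (so
the 8kB directory holds 2048 of them), the L bit at bit 9 as in the PTE
layout proposed above, and made-up function names. What happens after an
L=0 entry is found (the normal level-2 walk) is left out.

```c
#include <stdint.h>

#define PGD_ENTRIES 2048u               /* 8kB / 4 bytes: 11-bit index */
#define PTE_L       (UINT32_C(1) << 9)  /* assumed position of the L bit */

/* Slot reached by 8-bit indexing: VA bits 31-24, spaced 8 entries apart. */
static uint32_t pgd_index8(uint32_t va)
{
    return (va >> 24) << 3;
}

/* Slot reached by the full 11-bit index: VA bits 31-21. */
static uint32_t pgd_index11(uint32_t va)
{
    return va >> 21;
}

/*
 * Return the directory entry that applies to va. If the 8-bit indexed
 * entry has L set, index again with the three extra bits to pick the
 * 2MB page entry stored in the otherwise unused slots.
 */
static uint32_t pgd_lookup(const uint32_t *pgd, uint32_t va)
{
    uint32_t first = pgd[pgd_index8(va)];
    if (first & PTE_L)
        return pgd[pgd_index11(va)];
    return first;
}
```

For the address 0x01200000 this reads slot 8 first (the 0x0100... entry
in the diagram) and, since L=1 there, returns slot 9 (the 0x0120...
entry).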
This could get us 2MB huge pages and we could then keep the ATB stuff
around for the less useful 16MB huge pages.
This all plays reasonably nicely with the arch spec we've got today.
What would need clarifying is that these huge pages are 2MB and not
16MB, but this is all so vague in the spec as it stands and otherwise
unimplemented in practice that it ought to be doable.
Looking forward to comments!
/Jonas