Discussion:
[ORLinux] [PATCH] Request for comments - new atomic functions
Christian Svensson
2013-02-13 09:48:31 UTC
1) Is it worth doing some sort of call table?

2) We want to validate ptr (and ptr2 in the SWAP case).
This should probably be done before IRQs are disabled to handle page
faults and whatnot. I'm open to ideas on how to implement this.
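For reference, a representative subset of the proposed operations has these C-equivalent semantics (the function names here are illustrative only, not part of the patch; in the patch each load/modify/store runs with IRQs disabled, which is what makes it atomic on a uniprocessor):

```c
#include <stdint.h>

/* Illustrative C equivalents of some of the proposed operations.
 * Every operation returns the OLD value of *mem. */

uint32_t op_cmpxchg(uint32_t *mem, uint32_t old, uint32_t newval)
{
    uint32_t prev = *mem;
    if (prev == old)            /* store only on match */
        *mem = newval;
    return prev;
}

uint32_t op_xchg(uint32_t *mem, uint32_t newval)
{
    uint32_t prev = *mem;
    *mem = newval;
    return prev;
}

uint32_t op_add(uint32_t *mem, uint32_t val)
{
    uint32_t prev = *mem;
    *mem = prev + val;
    return prev;
}

uint32_t op_decpos(uint32_t *mem)
{
    uint32_t prev = *mem;
    if ((int32_t)prev > 0)      /* signed compare, matching l.sfgtsi */
        *mem = prev - 1;
    return prev;
}

uint32_t op_umax(uint32_t *mem, uint32_t val)
{
    uint32_t prev = *mem;
    if (prev < val)             /* unsigned compare, matching l.sfltu */
        *mem = val;
    return prev;
}
```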
---
arch/openrisc/kernel/entry.S | 165 ++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 162 insertions(+), 3 deletions(-)

diff --git a/arch/openrisc/kernel/entry.S b/arch/openrisc/kernel/entry.S
index e20c8dd..eb170d0 100644
--- a/arch/openrisc/kernel/entry.S
+++ b/arch/openrisc/kernel/entry.S
@@ -1112,16 +1112,88 @@ ENTRY(sys_rt_sigreturn)
* the atomic in question. Currently, this function implements the
* following variants:
*
- * XCHG:
+ * SWAP:
* @flag: 1
* @ptr1:
* @ptr2:
- * Atomically exchange the values in pointers 1 and 2.
+ * Atomically swap the values in pointers 1 and 2.
+ *
+ * CMPXCHG:
+ * @flag: 2
+ * @ptr: mem
+ * @val1: old
+ * @val2: new
+ * Writes new to *mem if *mem == old. Returns old *mem.
+ *
+ * XCHG:
+ * @flag: 3
+ * @ptr: mem
+ * @val1: new
+ * Store NEW in *MEM and return the old value.
+ *
+ * ADD:
+ * @flag: 4
+ * @ptr: mem
+ * @val1: val
+ * Add VAL to *MEM and return the old value of *MEM.
+ *
+ * DECPOS:
+ * @flag: 5
+ * @ptr: mem
+ * Decrement *MEM if it is > 0, and return the old value.
+ *
+ * AND:
+ * @flag: 6
+ * @ptr: mem
+ * @val1: mask
+ * Atomically *mem &= mask and return the old value of *mem.
+ *
+ * OR:
+ * @flag: 7
+ * @ptr: mem
+ * @val1: mask
+ * Atomically *mem |= mask and return the old value of *mem.
+ *
+ * UMAX: unsigned
+ * @flag: 8
+ * @ptr: mem
+ * @val1: val
+ * If *mem < val, set *mem = val. Returns old value of *mem.
+ *
+ * UMIN: unsigned
+ * @flag: 9
+ * @ptr: mem
+ * @val1: val
+ * If *mem > val, set *mem = val. Returns old value of *mem.
*
*/

ENTRY(sys_or1k_atomic)
- /* FIXME: This ignores r3 and always does an XCHG */
+ /* TODO: validate mem ptr(s) */
+ /* TODO: use some kind of jmp table? */
+ l.sfeqi r3,1
+ l.bf _atomic_swap
+ l.sfeqi r3,2
+ l.bf _atomic_cmpxchg
+ l.sfeqi r3,3
+ l.bf _atomic_xchg
+ l.sfeqi r3,4
+ l.bf _atomic_add
+ l.sfeqi r3,5
+ l.bf _atomic_decpos
+ l.sfeqi r3,6
+ l.bf _atomic_and
+ l.sfeqi r3,7
+ l.bf _atomic_or
+ l.sfeqi r3,8
+ l.bf _atomic_and
+ l.sfeqi r3,9
+ l.nop
+
+ l.jr r9
+ l.ori r11,r0,-EINVAL
+
+_atomic_swap:
DISABLE_INTERRUPTS(r17,r19)
l.lwz r29,0(r4)
l.lwz r27,0(r5)
@@ -1131,4 +1203,91 @@ ENTRY(sys_or1k_atomic)
l.jr r9
l.or r11,r0,r0

+_atomic_cmpxchg:
+ DISABLE_INTERRUPTS(r17,r19)
+ l.lwz r29,0(r4)
+ l.sfeq r29,r5
+ l.bnf _atomic_cmpxchg_done
+ l.nop
+
+ l.sw 0(r4),r6
+_atomic_cmpxchg_done:
+ ENABLE_INTERRUPTS(r17)
+ l.jr r9
+ l.or r11,r29,r0
+
+_atomic_xchg:
+ DISABLE_INTERRUPTS(r17,r19)
+ l.lwz r29,0(r4)
+ l.sw 0(r4),r5
+ ENABLE_INTERRUPTS(r17)
+ l.jr r9
+ l.or r11,r29,r0
+
+_atomic_add:
+ DISABLE_INTERRUPTS(r17,r19)
+ l.lwz r29,0(r4)
+ l.add r27,r29,r5
+ l.sw 0(r4),r27
+ ENABLE_INTERRUPTS(r17)
+ l.jr r9
+ l.or r11,r29,r0
+
+_atomic_decpos:
+ DISABLE_INTERRUPTS(r17,r19)
+ l.lwz r29,0(r4)
+ l.sfgtsi r29,0
+ l.bnf _atomic_decpos_done
+ l.addi r27,r29,-1
+
+ l.sw 0(r4),r27
+_atomic_decpos_done:
+ ENABLE_INTERRUPTS(r17)
+ l.jr r9
+ l.or r11,r29,r0
+
+_atomic_and:
+ DISABLE_INTERRUPTS(r17,r19)
+ l.lwz r29,0(r4)
+ l.and r27,r29,r5
+ l.sw 0(r4),r27
+ ENABLE_INTERRUPTS(r17)
+ l.jr r9
+ l.or r11,r29,r0
+
+_atomic_or:
+ DISABLE_INTERRUPTS(r17,r19)
+ l.lwz r29,0(r4)
+ l.or r27,r29,r5
+ l.sw 0(r4),r27
+ ENABLE_INTERRUPTS(r17)
+ l.jr r9
+ l.or r11,r29,r0
+
+_atomic_max:
+ DISABLE_INTERRUPTS(r17,r19)
+ l.lwz r29,0(r4)
+ l.sfltu r29,r5
+ l.bnf _atomic_max_done
+ l.nop
+
+ l.sw 0(r4),r5
+_atomic_max_done:
+ ENABLE_INTERRUPTS(r17)
+ l.jr r9
+ l.or r11,r29,r0
+
+_atomic_min:
+ DISABLE_INTERRUPTS(r17,r19)
+ l.lwz r29,0(r4)
+ l.sfgtu r29,r5
+ l.bnf _atomic_min_done
+ l.nop
+
+ l.sw 0(r4),r5
+_atomic_min_done:
+ ENABLE_INTERRUPTS(r17)
+ l.jr r9
+ l.or r11,r29,r0
+
/* ============================================================[ EOF ]=== */
--
1.8.1.2
Christian Svensson
2013-02-13 09:56:21 UTC
*sigh* copy paste errors.
+ l.sfeqi r3,8
+ l.bf _atomic_and
Should be _atomic_max
+ l.sfeqi r3,9
Missing l.bf _atomic_min
+ l.nop
+
Fixed entry for readability:

ENTRY(sys_or1k_atomic)
/* TODO: validate mem ptr(s) */
/* TODO: use some kind of jmp table? */
l.sfeqi r3,1
l.bf _atomic_swap
l.sfeqi r3,2
l.bf _atomic_cmpxchg
l.sfeqi r3,3
l.bf _atomic_xchg
l.sfeqi r3,4
l.bf _atomic_add
l.sfeqi r3,5
l.bf _atomic_decpos
l.sfeqi r3,6
l.bf _atomic_and
l.sfeqi r3,7
l.bf _atomic_or
l.sfeqi r3,8
l.bf _atomic_max
l.sfeqi r3,9
l.bf _atomic_min
l.nop
Jonas Bonn
2013-02-14 17:09:15 UTC
Hi Christian,
Post by Christian Svensson
1) Is it worth doing some sort of call table?
Yes, I think so... see below.
Post by Christian Svensson
2) We want to validate ptr (and ptr2 in the SWAP case).
This should probably be done before IRQs are disabled to handle page
faults and whatnot. I'm open to ideas on how to implement this.
Yeah, you should probably validate the pointers with something akin to
the access_ok macro.
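The essence of an access_ok-style check is a range test against the user address limit, done before IRQs go off so any fault is taken in a safe context. A self-contained sketch of that check (TASK_SIZE value and the helper name are illustrative, not the kernel's actual macro):

```c
#include <stddef.h>
#include <stdint.h>

#define TASK_SIZE 0x80000000UL  /* illustrative user/kernel split */

/* A range check akin to access_ok(): the pointer plus the access size
 * must stay below the user address limit, with no wraparound.  This is
 * a hypothetical stand-in for the real macro; access_ok() only bounds
 * the address range, so the fault path still needs fixup handling. */
static int user_range_ok(uintptr_t addr, size_t size)
{
    return size <= TASK_SIZE && addr <= TASK_SIZE - size;
}
```

A caller would reject bad pointers with -EFAULT before entering the interrupts-off section.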
Post by Christian Svensson
---
arch/openrisc/kernel/entry.S | 165 ++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 162 insertions(+), 3 deletions(-)
diff --git a/arch/openrisc/kernel/entry.S b/arch/openrisc/kernel/entry.S
index e20c8dd..eb170d0 100644
--- a/arch/openrisc/kernel/entry.S
+++ b/arch/openrisc/kernel/entry.S
@@ -1112,16 +1112,88 @@ ENTRY(sys_rt_sigreturn)
* the atomic in question. Currently, this function implements the
*
- * Atomically exchange the values in pointers 1 and 2.
+ * Atomically swap the values in pointers 1 and 2.
How about putting something else in front of this one so that we get
operation "0" as well? It seems strange to begin with "1", and a
lookup table is more logical to implement if we begin with 0.
Post by Christian Svensson
+ *
+ * Writes new to *mem if *mem == old. Returns old *mem.
+ *
+ * Store NEW in *MEM and return the old value.
+ *
+ * Add VAL to *MEM and return the old value of *MEM.
+ *
+ * Decrement *MEM if it is > 0, and return the old value.
+ *
+ * Atomically *mem &= mask and return the old value of *mem.
+ *
+ * Atomically *mem |= mask and return the old value of *mem.
+ *
+ * UMAX: unsigned
+ * If *mem < val, set *mem = val. Returns old value of *mem.
+ *
+ * UMIN: unsigned
+ * If *mem > val, set *mem = val. Returns old value of *mem.
*
*/
ENTRY(sys_or1k_atomic)
- /* FIXME: This ignores r3 and always does an XCHG */
+ /* TODO: validate mem ptr(s) */
+ /* TODO: use some kind of jmp table? */
+ l.sfeqi r3,1
+ l.bf _atomic_swap
+ l.sfeqi r3,2
+ l.bf _atomic_cmpxchg
+ l.sfeqi r3,3
+ l.bf _atomic_xchg
+ l.sfeqi r3,4
+ l.bf _atomic_add
+ l.sfeqi r3,5
+ l.bf _atomic_decpos
+ l.sfeqi r3,6
+ l.bf _atomic_and
+ l.sfeqi r3,7
+ l.bf _atomic_or
+ l.sfeqi r3,8
+ l.bf _atomic_and
+ l.sfeqi r3,9
+ l.nop
+
I think we could do this more efficiently if we do a simple lookup:
l.movhi r19,hi(atomic_table)
l.ori r19,r19,lo(atomic_table)
l.slli r3,r3,3
l.add r19,r19,r3
l.jr r19
l.nop
atomic_table:
l.j _atomic_swap
l.nop
l.j _atomic_xchg
l.nop
...etc.

...or something along those lines. If nothing else, it gives O(1)
transition time to the actual operation in question. If we want to be
clever we could even fill those l.nop's with something useful, but
that's not strictly necessary in a first iteration.
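The same O(1) dispatch idea, expressed in C as a table indexed by the operation number instead of a chain of compares (handler and table names are illustrative; only XCHG is wired up in this sketch, with empty slots falling through to the -EINVAL path, as the asm does for unknown operations):

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*atomic_op_t)(uint32_t *mem, uint32_t a, uint32_t b);

/* One example handler: exchange, returning the old value. */
static uint32_t op_xchg(uint32_t *mem, uint32_t a, uint32_t b)
{
    (void)b;                    /* XCHG takes a single value operand */
    uint32_t prev = *mem;
    *mem = a;
    return prev;
}

/* Slot per operation number; unfilled slots stay NULL. */
static const atomic_op_t atomic_table[] = {
    [3] = op_xchg,              /* flag 3: XCHG */
};

static long dispatch(unsigned type, uint32_t *mem, uint32_t a, uint32_t b)
{
    if (type >= sizeof(atomic_table) / sizeof(atomic_table[0]) ||
        !atomic_table[type])
        return -EINVAL;         /* unknown op, same as the asm's path */
    return atomic_table[type](mem, a, b);
}
```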

The rest looks pretty good to me... you already corrected the bits you
missed below in another patch.

I'll await a v2 of this patch from you before commenting further.

/Jonas
Post by Christian Svensson
+ l.jr r9
+ l.ori r11,r0,-EINVAL
+
DISABLE_INTERRUPTS(r17,r19)
l.lwz r29,0(r4)
l.lwz r27,0(r5)
@@ -1131,4 +1203,91 @@ ENTRY(sys_or1k_atomic)
l.jr r9
l.or r11,r0,r0
+ DISABLE_INTERRUPTS(r17,r19)
+ l.lwz r29,0(r4)
+ l.sfeq r29,r5
+ l.bnf _atomic_cmpxchg_done
+ l.nop
+
+ l.sw 0(r4),r6
+ ENABLE_INTERRUPTS(r17)
+ l.jr r9
+ l.or r11,r29,r0
+
+ DISABLE_INTERRUPTS(r17,r19)
+ l.lwz r29,0(r4)
+ l.sw 0(r4),r5
+ ENABLE_INTERRUPTS(r17)
+ l.jr r9
+ l.or r11,r29,r0
+
+ DISABLE_INTERRUPTS(r17,r19)
+ l.lwz r29,0(r4)
+ l.add r27,r29,r5
+ l.sw 0(r4),r27
+ ENABLE_INTERRUPTS(r17)
+ l.jr r9
+ l.or r11,r29,r0
+
+ DISABLE_INTERRUPTS(r17,r19)
+ l.lwz r29,0(r4)
+ l.sfgtsi r29,0
+ l.bnf _atomic_decpos_done
+ l.addi r27,r29,-1
+
+ l.sw 0(r4),r27
+ ENABLE_INTERRUPTS(r17)
+ l.jr r9
+ l.or r11,r29,r0
+
+ DISABLE_INTERRUPTS(r17,r19)
+ l.lwz r29,0(r4)
+ l.and r27,r29,r5
+ l.sw 0(r4),r27
+ ENABLE_INTERRUPTS(r17)
+ l.jr r9
+ l.or r11,r29,r0
+
+ DISABLE_INTERRUPTS(r17,r19)
+ l.lwz r29,0(r4)
+ l.or r27,r29,r5
+ l.sw 0(r4),r27
+ ENABLE_INTERRUPTS(r17)
+ l.jr r9
+ l.or r11,r29,r0
+
+ DISABLE_INTERRUPTS(r17,r19)
+ l.lwz r29,0(r4)
+ l.sfltu r29,r5
+ l.bnf _atomic_max_done
+ l.nop
+
+ l.sw 0(r4),r5
+ ENABLE_INTERRUPTS(r17)
+ l.jr r9
+ l.or r11,r29,r0
+
+ DISABLE_INTERRUPTS(r17,r19)
+ l.lwz r29,0(r4)
+ l.sfgtu r29,r5
+ l.bnf _atomic_min_done
+ l.nop
+
+ l.sw 0(r4),r5
+ ENABLE_INTERRUPTS(r17)
+ l.jr r9
+ l.or r11,r29,r0
+
/* ============================================================[ EOF ]=== */