For a function/method contains many input parameters, does it make a difference if passing-in in different orders? If does, in what aspects (readability, efficiency, ...)? I am more curious about how should I do for my own functions/methods?
It seems to me that:
Parameters passing by references/pointers often come before parameters passing by values. For example:
void* memset( void* dest, int ch, std::size_t count );
Destination parameters often come before source parameters. For example:
void* memcpy( void* dest, const void* src, std::size_t count );
Except for some hard constraints, i.e., parameters with default values must come last. For example:
size_type find( const basic_string& str, size_type pos = 0 ) const;
They are functional equivalent (achieve the same goal) no matter what order they pass in.
There are a few reasons it can matter - listed below. The C++ Standard itself doesn't mandate any particular behaviours in this space, so there's no portable way to reason about performance impact, and even if something's demonstrably (slightly) faster in one executable, a change anywhere in the program, or to the compiler options or version, might remove or even reverse the earlier benefit. In practice it's extremely rare to hear people talk about parameter ordering being of any significance in their performance tuning. If you really care you'd best examine your own compiler's output and/or benchmark resultant code.
Exceptions
The order of evaluation of expressions passed to function parameters is unspecified, and it's quite possible that it could be affected by changes to the order they appear in the source code, with some combinations working better in the CPU execution pipeline, or raising an exception earlier that short-circuits some other parameter preparation. This could be a significant performance factor if some of the parameters are temporary objects (e.g. results of expressions) that are expensive to allocate/construct and destruct/deallocate. Again, any change to the program could remove or reverse a benefit or penalty observed earlier, so if you care about this you should create a named temporary for parameters you want evaluated first before making the function call.
Registers vs cache (stack memory)
Some parameters may be passed in registers, while others are pushed on to the stack - which effectively means entering at least the fastest of the CPU caches, and implies their handling may be slower.
If the function ends up accessing all the parameters anyway, and the choice is between putting parameter X in a register and Y on the stack or vice versa, it doesn't matter much how they're passed, but given the function may have conditions affecting which variables are actually used (if statements, switches, loops that may or may not be entered, early returns or breaks etc.), it's potentially faster if a variable that's not actually needed was on the stack while one that was needed was in a register.
See http://en.wikipedia.org/wiki/X86_calling_conventions for some background and information on calling conventions.
Alignment and padding
Performance could theoretically be affected by the minutae of parameter passing conventions: the parameters may need particular alignment for any - or perhaps just full-speed - access on the stack, and the compiler might choose to pad rather than reorder the values it pushes - it's hard to imagine that being significant unless the data for parameters was on the scale of cache page sizes
Non-performance factors
Some of the other factors you mention can be quite important - for example, I tend to put any non-const pointers and references first, and name the function load_xxx, so I have a consistent expectation of which parameters may be modified and which order to pass them. There's no particularly dominant convention though.