From patchwork Sat Oct 29 11:58:29 2016
X-Patchwork-Submitter: Pekka Jääskeläinen
X-Patchwork-Id: 80048
In-Reply-To: <20161020151044.3ykk44abkjnoxoes@virgil.suse.cz>
References:
 <573A02A5.1020003@parmance.com>
 <20160801163709.ggc3ppaqh2qfq6g4@virgil.suse.cz>
 <20161020151044.3ykk44abkjnoxoes@virgil.suse.cz>
From: Pekka Jääskeläinen
Date: Sat, 29 Oct 2016 14:58:29 +0300
Subject: Re: [PATCH 2/4] BRIG (HSAIL) frontend: The FE itself.
To: Pekka Jääskeläinen, gcc-patches@gcc.gnu.org

Hi Martin,

Thanks for the comments and suggestions. Replies inline:

On Thu, Oct 20, 2016 at 6:10 PM, Martin Jambor wrote:
> - Still quite few things need to be documented better, e.g.:
>   + brig_to_generic::get_mangled_name_tmpl and to a lesser extent
>     brig_to_generic::get_mangled_name. It should be clear what is the
>     intended difference in usage of the two (specially since the
>     former is a template and so the parameter does not give that much
>     of a hint to the reader)

Added more comments.

>   + the visitor classes need some description so that the first time
>     readers see them, they understand what they are for and what they
>     visit (i.e. what "visiting" even means).

This is an adaptation of the classic Gang of Four Visitor design
pattern. I added a reference to it in a comment.

> - I know it was me who told you to use gcc_assert and gcc_unreachable
>   instead of internal_error. The thing is, often the error is clearly
>   not an internal error but an error in input. I think that we should
>   plan to handle these cases differently and report the issues better
>   to the user, give a meaningful error message together with the
>   section and the offset there when it was encountered. I am not
>   asking you to audit all asserts now and convert those in this
>   category but it would be nice to have a mechanism to easily do so
>   (and convert a few obvious places), so that we can convert these as
>   we bump into them.

I'm not sure about this. The BRIG FE is a rather special case, as we
assume HSAILasm has been used to parse and error check the original
HSAIL text into the binary BRIG format which the FE consumes.
Of course HSAILasm can have bugs, but how much effort we should put into
producing human-readable error messages to help debug HSAILasm is
another matter. If the BRIG FE fails to consume the input, it means the
BRIG is corrupted for one reason or another, which is typically not a
human error (those should be caught by HSAILasm).

A "File format not recognized" error is one that might be useful, though.
I added a check for the BRIG magic number and the supported version (1.0).

Perhaps we should add error printouts later, on a case-by-case basis,
when we see which error cases are useful and worth reporting in a
human-readable, graceful manner? It can be as easy as converting the
internal_error to fatal_error or similar in that case.

> - A very minor suggestion: In GCC it is customary to write TODO as one
>   word. We generally do not use "TO OPTIMIZE", that is just a TODO
>   (as opposed to a FIXME, which hints that something is at least a bit
>   wrong here). I think you can keep your way if you want but for
>   example I do have emacs highlighting set up for the traditional
>   formats.

Converted all to TODO.

> - You do not seem to handle BRIG_OPCODE_ICALL, or have I missed it?
>   That is fine, I don't think anybody else does that now anyway, I
>   just got curious reading the ipa analysis.

Right.

> - brig-lang.c:
>   + x_flag_whole_program = 0; - talk about this with Honza

I guess this is a note to yourself? As we agreed, I didn't try whole
program optimizations yet. I might later when optimizing for a target.
It will be really useful, I agree. Now that it uses the proper builtin
way, it might work more easily.

>   + brig_langhook_type_for_size: has several issues. Either always
>     return a type like go or return error_mark_node instead of NULL.
>     Also, do not count on long being 64-bit. I would just copy what
>     go does. Or lto.

Copied Go's version; it should work with BRIG too.
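
For illustration, the magic number and version check mentioned above
could look roughly like the sketch below. It is not the code from the
patch: brig_header_prefix, brig_check and check_brig_prefix are
hypothetical names, and the 8-byte "HSA BRIG" identification followed by
two 32-bit version fields is an assumption about the module header
layout that should be verified against the HSA PRM.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical, simplified view of the first bytes of a BRIG module.
// Assumption: an 8-byte magic followed by 32-bit major/minor versions,
// stored in the host's byte order.
struct brig_header_prefix
{
  char identification[8];    // magic string, not NUL-terminated
  std::uint32_t brig_major;  // the reply mentions version 1.0 only
  std::uint32_t brig_minor;
};

enum class brig_check { ok, too_short, bad_magic, bad_version };

// Validate the start of an in-memory module. The prefix is copied out
// with memcpy so that a misaligned input buffer is not dereferenced.
inline brig_check
check_brig_prefix (const unsigned char *data, std::size_t size)
{
  brig_header_prefix h;
  if (size < sizeof h)
    return brig_check::too_short;
  std::memcpy (&h, data, sizeof h);
  if (std::memcmp (h.identification, "HSA BRIG", 8) != 0)
    return brig_check::bad_magic;
  if (h.brig_major != 1 || h.brig_minor != 0)
    return brig_check::bad_version;
  return brig_check::ok;
}
```

In a frontend, the non-ok results would presumably map to a
"file format not recognized" style fatal error rather than an assert.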
>   + brig_langhook_type_for_mode: Also please do not depend on knowing
>     that long is 8 bytes, or int being 4 bytes long. For complex
>     modes, the correct thing seems to be to return NULL instead of
>     void_type_node. In any case, it would be better to return
>     error_mark_node rather than void_type_node.

Fixed.

>   + convert: did you avoid using convert_to_vector deliberately? The
>     size check seems genuinely useful.

BRIG/HSAIL is a bit of a special case due to its "untyped" variables
(registers); I use bit-level casts in a lot of places to avoid
accidental type conversions or sign extensions.

> - brigfrontend/brig-to-generic.cc:
>   + brig_to_generic::parse: You seem to be handling LDA instruction
>     (or more generally, BRIG_KIND_INST_ADDR instructions, but there is
>     just the one) with copy_move_inst_handler, which seems just wrong.

What do you mean by 'wrong'? The handlers do not map to instruction
types directly, but often more towards the specification chapters.
It comes straight from:
http://www.hsafoundation.com/html/HSA_Library.htm#PRM/Topics/18_BRIG/BRIG_syntax_copy_move.htm

>     Its operator() blindly casts the instruction to
>     BrigInstSourceType, interpreting the segment as if it was a source
>     type... am I missing something?

I believe you are confused by the BRIG struct name: BrigInstSourceType
is a struct for "instructions that have different types for their
destination and source operands":
http://www.hsafoundation.com/html/HSA_Library.htm#PRM/Topics/18_BRIG/BrigInstSourceType.htm

That is, it's not a "source type" object, but an instruction type object.

>   + build_reinterpret_cast: gcc_unreachable followed by return
>     NULL_TREE does not make sense, please convert the whole thing to
>     an assert. More importantly, I think that you really need to

Converted it to an assert.

>     solve the case of mismatching sizes differently, and not create a
>     (single) V_C_E for it.
>     For register types, you can V_C_E them to
>     an unsigned int of the same size, then do a widening NOP_EXPR to
>     unsigned type with the same size as the destination and then do a
>     V_C_E to whatever you need. But perhaps this can be solved better
>     at the caller side somewhere?

I added the extra conversion step through unsigned ints. I hadn't
noticed that V_C_E is not guaranteed to work for this case.

>   + brig_to_generic::finish_function: the ifdef'ed debug_functions
>     should not be a part of the final submission. Checking

Removed.

>     m_cf->m_is_kernel twice looks ugly.

Removed the extra check.

>   + brig_to_generic::append_group_variable: Don't put a statement
>     guarded with an if statement on the same line as the condition. I
>     also believe align_padding, when the offset is not already
>     aligned, should be:
>       alignment - m_next_group_offset % alignment
>   + brig_to_generic::append_private_variable: The same as above.

Good catch!

>   + I would suggest that you change brig_to_generic::dump_function to
>     a standalone function taking a file parameter. It is then much
>     easier to call it for example from debugger.

Done.

> - brig-code-entry-handler.cc
>   + brig_code_entry_handler::build_tree_operand: The unreachable
>     return at the end looks misleading, just removing it should be
>     fine. Ditto for the breaks after returns, the upcoming
>     fall-through warning will notify us if we ever get a potentially
>     wrong fall-through.

Done.

>   + brig_code_entry_handler::get_tree_cst_for_hsa_operand: GCC coding
>     standard mandates that if a branch has only one statement in it,
>     it should not be encapsulated in braces.

Done.

>     If, in the condition
>
>       if (type != NULL && TREE_CODE (type) == ARRAY_TYPE)
>
>     type can ever actually be NULL, then the function will segfault
>     just before ending, when again checking whether type is an array.
>     If it cannot be NULL, then please gcc_checking_assert it instead.

It cannot be NULL anymore at that point. Removed the misleading check.
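
The align_padding formula quoted above for append_group_variable and
append_private_variable can be sketched as follows. This is only a
minimal illustration: segment_layout and its append method are
hypothetical stand-ins for the frontend's offset bookkeeping, not the
actual brig_to_generic members.

```cpp
#include <cassert>
#include <cstddef>

// Number of bytes to skip so that `offset` becomes a multiple of
// `alignment`. Zero when the offset is already aligned, otherwise
// alignment - offset % alignment, as suggested in the review.
constexpr std::size_t
align_padding (std::size_t offset, std::size_t alignment)
{
  return offset % alignment == 0 ? 0 : alignment - offset % alignment;
}

// Hypothetical stand-in for the per-segment offset tracking.
struct segment_layout
{
  std::size_t next_offset = 0;

  // Reserve `size` bytes aligned to `alignment` and return the offset
  // at which the variable was placed.
  std::size_t
  append (std::size_t size, std::size_t alignment)
  {
    assert (alignment != 0);
    next_offset += align_padding (next_offset, alignment);
    std::size_t var_offset = next_offset;
    next_offset += size;
    return var_offset;
  }
};
```

For example, appending a 1-byte variable and then a 4-byte variable with
4-byte alignment places the second one at offset 4, with three bytes of
padding in between.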
>   + brig_code_entry_handler::build_address_operand: I must say I
>     really dislike how the end of the function is structured, it is
>     terribly difficult to read given that it is not doing anything that
>     complicated and I think it does not handle correctly an (LDA of a)
>     NULL address, which unfortunately I believe is valid for private
>     and group segment addresses. Dealing with the most complex case
>     by converting symbol to size_type looks exactly backwards,
>     especially given that you have converted the base of the
>     POINTER_PLUS_EXPR only few lines before. I think the code would
>     be a lot nicer and easier to comprehend if you clearly
>     distinguished the various cases (symbol_base != NULL (and sub
>     cases when ptr_base is or is not NULL), ptrbase != NULL and simple
>     constant, even NULL constant, which you do not handle but fail an
>     assert, I think) and handled them separately, including all type
>     conversions. ptr_base is an unfortunate name, IMHO, in many cases
>     it has the role of a variable offset rather than a base.
>     Similarly, ptr_ofset is really a constant_offset.

Renamed ptr_base to var_offset and ptr_offset to const_offset, which
are indeed more descriptive. I also cleaned up the if..else mess at the
end of the function and made the different cases very explicit, instead
of the spaghetti mess of the previous version. It looks much better now,
to me at least.

>   + brig_code_entry_handler::get_tree_type_for_hsa_type: I believe it
>     is better to and the type with BRIG_TYPE_PACK_MASK if you want to
>     determine whether you are looking at a vector (packed) HSAIL

Done.

>     instruction. I also think that putting that test into a separate
>     function and calling it from all places when you do this would be
>     more future-proof (something like hsa_type_packed_p in gcc/hsa.c).

I just used that function instead and called it from everywhere.
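
As an illustration of the mask-based packed-type test discussed above, a
minimal sketch follows. The constant values are assumptions about the
BRIG type encoding (base type in the low five bits, packing in the next
two) and the names are hypothetical; the authoritative definitions are
the BRIG_TYPE_* enumerators in the HSA PRM.

```cpp
#include <cstdint>

// Assumed encoding, for illustration only: check the HSA PRM before
// relying on these values.
constexpr std::uint16_t TYPE_BASE_MASK = 0x1f;  // base (element) type
constexpr std::uint16_t TYPE_PACK_MASK = 0x60;  // packing bits
constexpr std::uint16_t TYPE_PACK_NONE = 0x00;  // scalar, not packed
constexpr std::uint16_t TYPE_PACK_32 = 0x20;    // packed into 32 bits

// True when the type code describes a packed (vector) type. Keeping
// the test in one helper means every caller agrees on the definition.
constexpr bool
type_packed_p (std::uint16_t type)
{
  return (type & TYPE_PACK_MASK) != TYPE_PACK_NONE;
}

// Element type of a possibly packed type code.
constexpr std::uint16_t
base_type (std::uint16_t type)
{
  return type & TYPE_BASE_MASK;
}
```

With this encoding, a scalar type code such as 0x01 is not packed, while
0x01 | TYPE_PACK_32 is packed and has base type 0x01.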
>     For the (inner_brig_type == BRIG_TYPE_F16) vector case, you do not
>     end up calling build_type_variant (why is it necessary in any
>     case?) but for other vector types you do. Is that intentional?

I don't remember why the const stripping code was there. Maybe a
leftover of some earlier hack/workaround in a previous gcc version.
I removed it and it seems to work fine.

>   + brig_code_entry_handler::expand_or_call_builtin: If I am correct
>     that the operands parameter contains only input operands at this
>     point, please state so in the function comment. Since we only
>     allow ourselves 80 characters wide code, it is customary not to
>     put code into else branches if the if branch returns (or
>     breaks/continues) anyway.

Done.

>     Given that this function builds vectors only to pile them
>     element-wise to arguments of variable-argument-length function
>     call_builtin, which then builds vectors out of its
>     elements... have you considered having an overloaded
>     implementation of call_builtin that would not do this? It seems
>     particularly wasteful.

Optimized by inlining the essential builtin build code to the call site.

>   + brig_code_entry_handler::build_operands: Please make the return
>     type consistent with the one in class definition (tree_stl_vec
>     instead of explicit std::vector, assuming you prefer the
>     former).

Done.

>     GCC coding standard mandates that if a branch has only one
>     statement in it, it should not be encapsulated in braces.

Cleaned up.

>     Please replace
>       if (operand == NULL_TREE)
>         gcc_unreachable ();
>     with gcc_assert (operand);

Done.

>     Please rewrite conditions like !(TREE_CODE (operand) == TREE_VEC)
>     as (TREE_CODE (operand) != TREE_VEC)

Did.

>     Again you are creating VIEW_CONVERT_EXPR for an operand that is of
>     a different size than the result type. For scalar types, this is
>     bound to cause trouble sooner or later and I really think you need
>     to avoid it.

Fixed by delegating the conversion to the (now updated)
build_reinterpret_cast().
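
The conversion sequence discussed above (same-size bit cast to an
unsigned integer, widening, then a bit cast to the destination) has a
plain C++ analogue, sketched below. This is not the GENERIC-building
code from build_reinterpret_cast; the function names are hypothetical,
and the sketch assumes a 32-bit IEEE 754 float.

```cpp
#include <cstdint>
#include <cstring>

// Step 1: reinterpret the bits of a float as an unsigned integer of the
// same size (the analogue of a same-size VIEW_CONVERT_EXPR).
inline std::uint32_t
bits_of (float f)
{
  static_assert (sizeof (float) == sizeof (std::uint32_t),
                 "sketch assumes a 32-bit float");
  std::uint32_t u;
  std::memcpy (&u, &f, sizeof u);
  return u;
}

// Step 2: widen to the 64-bit "register" size. Going through unsigned
// types zero-extends, so no sign extension or value conversion occurs.
inline std::uint64_t
to_reg64 (float f)
{
  return static_cast<std::uint64_t> (bits_of (f));
}

// Reverse direction: truncate to the source size, then bit cast back.
inline float
from_reg64_low (std::uint64_t reg)
{
  std::uint32_t u = static_cast<std::uint32_t> (reg);
  float f;
  std::memcpy (&f, &u, sizeof f);
  return f;
}
```

The point of the intermediate unsigned step is that the size change is
done by an ordinary integer conversion with well-defined results, and
only same-size values are ever reinterpreted.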
> - brig-basic-inst-handler.cc:
>   + scalarized_sat_arithmetics::scalarized_sat_arithmetics: Do not
>     undef macros preemptively. Instead, undef them right after using
>     them, after the include of a .def file.

This seems to be an idiom with the builtin import mechanism elsewhere
too. builtins.def defines a default macro, which one must undef if one
does not want to do anything with that builtin type in that particular
import location.

>   + brig_basic_inst_handler::must_be_scalarized: I am intrigued by the
>     (elements < 16) part of the condition. This function would also
>     benefit from a comment.

This function blacklisted the known cases of MULHI with vectors that
have been tested to break with AMD64/x86-64. It seems the support for
vector MULT_HIGHPART_EXPR is flaky and undertested. The robust thing to
do here is to always force scalarization with MULHI for now, until these
issues are debugged further. I added an exception for 2x64b
MULT_HIGHPART_EXPR to avoid the need for 128b scalar arithmetic, and
because it seems to work on the CPUs I've tested.

IMO the decision should not belong to the frontend. Rather, there should
be a step where vector operations are optionally scalarized if the
target prefers scalarized operations, and that step should catch this
case as well. Something to fix during optimization work.

>   + scalarized_sat_arithmetics::builtin - please prefix with m_, ditto
>     for brig_inst_ (why the trailing underscore?)

Done.

>   + brig_basic_inst_handler::must_be_scalarized needs a comment
>     explaining what it is for.

Removed the method as unneeded for now (see the comment above about MULHI).

>   + brig_basic_inst_handler::get_raw_type: Unless "raw" is some HSA
>     term I have missed, I would strongly suggest that you rename the
>     function to something more immediately obvious, like uint_for_type
>     or something like that. Also, don't use literal 8 but

Renamed to get_unsigned_int_type().

>     BITS_PER_UNIT. Also, do we really need both this function and

Done.
>     brig_code_entry_handler::get_raw_tree_type ?

Nope.

>   + build_shuffle, build_unpack, build_pack, build_unpack_lo_or_hi and
>     build_lower_element_broadcast: I admit that so far I have only
>     very briefly skimmed through these functions. In any way, use
>     BITS_PER_UNIT instead of 8.

Done.

>   + brig_basic_inst_handler::build_instr_expr: Please remove the 'r'
>     and make it build_inst_expr for the sake of consistency. If I

Done.

>     understand the code correctly, the operands parameter contains
>     only input operands. In that case, please state so in the
>     function comment and remove the local variable first_input, it has
>     no purpose but to confuse. Also, please move the definition and

Done.

>     assignments to local variable input_count (and possibly also
>     output_count) down to where it is used.

Done.

>   + brig_basic_inst_handler::operator (): It seems that the opcode
>     local variable is only used to identify the return brig
>     instructions which seems wasteful. Generally, it would be nice to

It's also used to catch MULHI, which is generated from multiple BRIG
opcodes.

>     clean this function up a little by moving assignments to some of
>     the very many local variables down, as close to their first use as
>     reasonable. Surprisingly often, you'd remove the need to compute
>     them in many cases at all, e.g. look at element_count and
>     element_size_bits.

Moved element_count and is_fp16_operation. element_size_bits is now
used for catching MULHIs for 64b elements.

>     Extra points for a function comment explaining how work is divided
>     in between operator() itself and its main helpers such as
>     build_instr_expr.

I added a method comment, but the truth is that the division of work is
a bit artificial; mostly the build_inst_expr() call is there to split a
complex if..else structure into two functions to improve readability.
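
For readers unfamiliar with MULHI, the scalarization discussed above can
be pictured with the sketch below: each element's high half is computed
with a double-width scalar multiply. This is an illustration in plain
C++, not the GENERIC the frontend emits, and the function names are
hypothetical. It covers unsigned 32-bit elements only; 64-bit elements
would need a 128-bit product, which is the case the 2x64b exception
avoids.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// High 32 bits of the 64-bit product of two unsigned 32-bit values.
constexpr std::uint32_t
mulhi_u32 (std::uint32_t a, std::uint32_t b)
{
  return static_cast<std::uint32_t> ((static_cast<std::uint64_t> (a) * b)
                                     >> 32);
}

// Scalarized vector MULHI: apply the scalar operation per element
// instead of relying on a vector high-part multiply.
template <std::size_t N>
std::array<std::uint32_t, N>
mulhi_scalarized (const std::array<std::uint32_t, N> &a,
                  const std::array<std::uint32_t, N> &b)
{
  std::array<std::uint32_t, N> result{};
  for (std::size_t i = 0; i < N; ++i)
    result[i] = mulhi_u32 (a[i], b[i]);
  return result;
}
```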
>   + brig_basic_inst_handler::get_tree_code_for_hsa_opcode: The comment
>     says the special value returned when it is necessary to use a
>     chain of tree expressions or an builtin is NULL_TREE, but the
>     function itself returns TREE_LIST or CALL_EXPR.

Corrected.

> - brig-cmp-inst-handler.cc:
>   + brig_cmp_inst_handler::operator (): the neg_expr seems to be
>     something left from earlier times? Use BITS_PER_UNIT instead of

Correct. Removed.

>     8, having both result_width and element_width seems unnecessary

Removed element_width and moved the result_width definition closer to
its use.

>     (and speaking of elements, is that actually even a vector case?),
>     and should be initialized only in the case when it is used. In
>     case of vector results, please build either all_ones or all_zeros,
>     it is wasteful to allocate both.

Both are used to produce the all_ones/all_zeros output that HSA
requires.

> - brig-mem-inst-handler.cc: I believe that using the alignment
>   modifier is something that we should try to get done as soon as
>   possible.

I agree. Probably one of the first things that will pop up when
optimizing the performance.

> - brig-inst-mod-handler.cc: This seems like something that we should
>   at least warn about (in case when effectively an unsupported
>   operation is requested).

If the frontend is upgraded to support the 'full' profile (it only
supports 'base' now) with all the rounding modifiers, a better way might
be found than injecting fesetround() calls around all float expressions.
Probably in that case all float ops must be converted to builtin calls
that ensure the wanted rounding (ungh!).

> - brig-seg-inst-handler.cc: At this point I'm trying to read quickly
>   but it seems to me you do not support conversion between flat and
>   global segment... how come?

A global address is already a flat address.
Check the description tab in
http://www.hsafoundation.com/html/HSA_Library.htm#PRM/Topics/05_Arithmetic/segment_conversion.htm?Highlight=global

> - brig-copy-move-inst-handler.cc:
>   + brig_copy_move_inst_handler::operator (): The function definitely
>     should not cast LDA instructions to BrigInstSourceType*!

Yes it should. As I explained above, the struct name is misleading, but
it's actually an instruction class, not a source type.

> - brig-branch-inst-handler.cc: I believe that as long as the builtins
>   representing barriers are not pure, they will not be hoisted out of
>   a loop. Nonduplication might indeed be a problem, although short of
>   whole function cloning, I could not think of a transformation gcc
>   performs that might pose a problem. Nevertheless, we probably
>   should introduce an attribute for it and look for it in
>   gimple_can_duplicate_bb_p (and in cfg_layout_can_duplicate_bb_p?).
>   An important issue, but hopefully for later.

Agreed.

> - brig-variable-handler.cc:
>   + brig_directive_variable_handler::operator (): Please use
>     BITS_PER_UNIT instead of 8.

Done.

>   + build_variable: Likewise. I am a bit concerned that unlike in
>     operator(), you do not make the alignment at least as big as
>     natural one, which means that in theory (and probably only on
>     malformed BRIG, I suppose), the two functions might disagree about
>     alignment? I think it would be nice to outline the extraction of
>     alignment to an independent function and use that from both
>     places.

Done.

> - brig-function-handler.cc:
>   + brig_directive_function_handler::operator: Please use gcc_assert
>     instead of assert. (Well, in this case it is clearly input error
>     which, eventually, we will want to give nice errors about. But at
>     least do not use assert.)

Converted to gcc_assert() for now.

> - brig-function.cc:
>   + brig_function::analyze_calls: The first if condition should be
>     terminated by a newline

Done.

>   + brig_function::add_wi_loop: Is the second TODO now obsolete?

Yes, removed.
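
The shared alignment helper suggested above for operator() and
build_variable might look something like the following sketch. The
function names and the byte-based interface are hypothetical; CHAR_BIT
stands in for GCC's BITS_PER_UNIT, which is not available outside the
compiler sources.

```cpp
#include <climits>
#include <cstddef>

// Alignment to use for a variable: never smaller than the natural
// alignment of its type, even if the (possibly malformed) input
// declares less. Both arguments and the result are in bytes.
constexpr std::size_t
effective_alignment_bytes (std::size_t natural_bytes,
                           std::size_t declared_bytes)
{
  return declared_bytes > natural_bytes ? declared_bytes : natural_bytes;
}

// Byte-to-bit conversion without a literal 8.
constexpr std::size_t
alignment_bits (std::size_t alignment_bytes)
{
  return alignment_bytes * CHAR_BIT;
}
```

Calling the same helper from both places means the two code paths cannot
disagree about a variable's alignment.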
>   + brig_function::build_launcher_and_metadata: The ASM directive is
>     really an ugly hack. It is isolated so I am not that much
>     concerned, but building a structure and filling it with data (like
>     we do for example in hsa_output_libgomp_mapping) seems cleaner.

Hmm. I don't generate any metadata structs in the data section, but a
separate custom ELF section per kernel. I agree the ASM directive is not
ideal in general, but I don't think there's a generic way to add custom
ELF sections.

I'm not sure that building the structure field by field would be a
better approach than a raw dump, as the point is just to transfer the
metadata to the HSA runtime with the finalized binary. The runtime
should use exactly the same struct layout, otherwise it won't work
anyway. If I do a raw dump, at least I ensure that if the struct is
updated I won't forget to update the serialization code.

> - brig-util.cc:
>   + gccbrig_is_raw_operation: I think that calling the operations
>     "bit" operations instead of "raw" would make life of readers of
>     the code slightly but noticeably easier.

Renamed.

>   + gccbrig_hsa_type_bit_size: If possible, please make the default
>     case be gcc_unreachable(). (If zero is expected in some cases,
>     then all callers should check for it, so that we for example do
>     not divide by zero in
>     brig_code_entry_handler::get_tree_type_for_hsa_type.)

Made it call gcc_unreachable ().

>   + might_be_host_defined_var: there is no need for the returned
>     expression to start on a new line.

Fixed.

> - gcc/builtins.def: In the added DEF_HSAIL_*_BUILTIN macros, please
>   arrange it so that they only pass true in the last argument of
>   DEF_BUILTIN when gcc is configured with BRIG FE. Builtins are not
>   free and should not be added needlessly.

It doesn't even include hsail-builtins.def when the BRIG FE is not
enabled now. I suppose that's even better than executing those macros
for nothing.

> Overall, the code has improved significantly.
> As far as I am
> concerned, the only real issue I see are the VIEW_CONVERT_EXPRs with
> mismatched operands. They are asking for trouble, only Ada produces
> those (although it is acknowledged it should not) and Ada only does it
> for aggregates.

These should be fixed now.

> If I understood you correctly, both you and your sponsor have already
> signed the Copyright assignment, right? If that is so, I'll ask the
> steering committee to approve the intention and then ask a global
> reviewer to also peek at it.

Correct, and I see you already did. Thanks!

> Thanks for your patience,

Thanks a lot for the comments. I know how much patience it requires to
wade through a big bunch of someone else's boring code.

New BRIG FE patch set attached.

BR,
Pekka

This patch set adds a BRIG (HSAIL) frontend. It can be used as a core
for an HSAIL finalizer implementation for processors with gcc backends.

It is a somewhat unusual frontend, as the consumed format is a binary
representation. The textual HSAIL can be compiled to it with a separate
assembler tool.

The frontend has been mostly tested with the HSA PRM conformance suite,
which it now passes. The accompanying GENERIC-scanning test suite is
supposed to be only a smoke test.

libhsail-rt implements HSAIL-specific builtins and includes a simple
runtime that implements SPMD execution via either Pth-based fibers or
loops to execute multiple work-item work groups without
SPMD/SIMD-default hardware.

I've split it into 4 patches:

001 - the configuration file changes and misc.
002 - the frontend itself
003 - libhsail-rt
004 - the smoke test suite

The diffstat is as follows:

 .gitignore | 2 +-
 Makefile.def | 3 +
 Makefile.in | 489 +
 configure | 1 +
 configure.ac | 1 +
 gcc/brig/Make-lang.in | 247 +
 gcc/brig/brig-builtins.h | 99 +
 gcc/brig/brig-c.h | 66 +
 gcc/brig/brig-lang.c | 770 +
 gcc/brig/brigfrontend/brig-arg-block-handler.cc | 66 +
 gcc/brig/brigfrontend/brig-atomic-inst-handler.cc | 265 +
 gcc/brig/brigfrontend/brig-basic-inst-handler.cc | 865 +
 gcc/brig/brigfrontend/brig-branch-inst-handler.cc | 221 +
 gcc/brig/brigfrontend/brig-cmp-inst-handler.cc | 198 +
 gcc/brig/brigfrontend/brig-code-entry-handler.cc | 1712 ++
 gcc/brig/brigfrontend/brig-code-entry-handler.h | 422 +
 gcc/brig/brigfrontend/brig-comment-handler.cc | 39 +
 gcc/brig/brigfrontend/brig-control-handler.cc | 108 +
 .../brigfrontend/brig-copy-move-inst-handler.cc | 58 +
 gcc/brig/brigfrontend/brig-cvt-inst-handler.cc | 260 +
 gcc/brig/brigfrontend/brig-fbarrier-handler.cc | 44 +
 gcc/brig/brigfrontend/brig-function-handler.cc | 373 +
 gcc/brig/brigfrontend/brig-function.cc | 719 +
 gcc/brig/brigfrontend/brig-function.h | 213 +
 gcc/brig/brigfrontend/brig-inst-mod-handler.cc | 58 +
 gcc/brig/brigfrontend/brig-label-handler.cc | 37 +
 gcc/brig/brigfrontend/brig-lane-inst-handler.cc | 84 +
 gcc/brig/brigfrontend/brig-machine.c | 44 +
 gcc/brig/brigfrontend/brig-machine.h | 33 +
 gcc/brig/brigfrontend/brig-mem-inst-handler.cc | 180 +
 gcc/brig/brigfrontend/brig-module-handler.cc | 41 +
 gcc/brig/brigfrontend/brig-queue-inst-handler.cc | 93 +
 gcc/brig/brigfrontend/brig-seg-inst-handler.cc | 146 +
 gcc/brig/brigfrontend/brig-signal-inst-handler.cc | 42 +
 gcc/brig/brigfrontend/brig-to-generic.cc | 812 +
 gcc/brig/brigfrontend/brig-to-generic.h | 226 +
 gcc/brig/brigfrontend/brig-util.cc | 446 +
 gcc/brig/brigfrontend/brig-util.h | 53 +
 gcc/brig/brigfrontend/brig-variable-handler.cc | 263 +
 gcc/brig/brigfrontend/phsa.h | 69 +
 gcc/brig/brigspec.c | 135 +
 gcc/brig/config-lang.in | 41 +
 gcc/brig/lang-specs.h | 28 +
 gcc/brig/lang.opt | 41 +
 gcc/builtin-types.def | 80 +-
 gcc/builtins.def | 41 +
 gcc/config.in | 6 +
 gcc/configure | 10 +-
 gcc/configure.ac | 5 +
 gcc/doc/frontends.texi | 2 +-
 gcc/doc/invoke.texi | 4 +
 gcc/doc/standards.texi | 8 +
 gcc/hsail-builtins.def | 659 +
 gcc/testsuite/brig.dg/README | 10 +
 gcc/testsuite/brig.dg/dg.exp | 27 +
 gcc/testsuite/brig.dg/test/gimple/alloca.hsail | 37 +
 gcc/testsuite/brig.dg/test/gimple/atomics.hsail | 33 +
 gcc/testsuite/brig.dg/test/gimple/branches.hsail | 58 +
 gcc/testsuite/brig.dg/test/gimple/fbarrier.hsail | 74 +
 .../brig.dg/test/gimple/function_calls.hsail | 59 +
 gcc/testsuite/brig.dg/test/gimple/kernarg.hsail | 25 +
 gcc/testsuite/brig.dg/test/gimple/mem.hsail | 39 +
 gcc/testsuite/brig.dg/test/gimple/mulhi.hsail | 33 +
 gcc/testsuite/brig.dg/test/gimple/packed.hsail | 78 +
 .../brig.dg/test/gimple/smoke_test.hsail | 91 +
 gcc/testsuite/brig.dg/test/gimple/variables.hsail | 124 +
 gcc/testsuite/brig.dg/test/gimple/vector.hsail | 57 +
 gcc/testsuite/lib/brig-dg.exp | 29 +
 gcc/testsuite/lib/brig.exp | 40 +
 include/hsa-interface.h | 630 +
 libhsail-rt/Makefile.am | 124 +
 libhsail-rt/Makefile.in | 740 +
 libhsail-rt/README | 4 +
 libhsail-rt/aclocal.m4 | 978 +
 libhsail-rt/config.h.in | 217 +
 libhsail-rt/configure | 17016 ++++++++++++++++++
 libhsail-rt/configure.ac | 151 +
 libhsail-rt/include/internal/fibers.h | 95 +
 .../include/internal/phsa-queue-interface.h | 60 +
 libhsail-rt/include/internal/phsa-rt.h | 94 +
 libhsail-rt/include/internal/workitems.h | 107 +
 libhsail-rt/m4/libtool.m4 | 7997 ++++++++
 libhsail-rt/m4/ltoptions.m4 | 384 +
 libhsail-rt/m4/ltsugar.m4 | 123 +
 libhsail-rt/m4/ltversion.m4 | 23 +
 libhsail-rt/m4/lt~obsolete.m4 | 98 +
 libhsail-rt/rt/arithmetic.c | 475 +
 libhsail-rt/rt/atomics.c | 115 +
 libhsail-rt/rt/bitstring.c | 190 +
 libhsail-rt/rt/fbarrier.c | 87 +
 libhsail-rt/rt/fibers.c | 212 +
 libhsail-rt/rt/fp16.c | 135 +
 libhsail-rt/rt/misc.c | 89 +
 libhsail-rt/rt/multimedia.c | 135 +
 libhsail-rt/rt/queue.c | 71 +
 libhsail-rt/rt/sat_arithmetic.c | 299 +
 libhsail-rt/rt/segment.c | 57 +
 libhsail-rt/rt/workitems.c | 952 +
 libhsail-rt/target-config.h.in | 68 +
 99 files changed, 43463 insertions(+), 5 deletions(-)